Introduction

The citation index first appeared in a print edition in the early 1960s. Later, with the development and popularization of computers and networks, CD and network editions became the major release forms. The CD edition, and especially the network edition, of the citation index plays an even more significant role in scientific research, evaluation and management. Citation indexes have aroused great interest in many countries and have now become an important tool for evaluating the research level of a country, a region, an institution or an individual researcher.

Since the Institute for Scientific Information (ISI) published the Science Citation Index, many countries and regions, especially those where English is not the first language, have launched citation indexes of their own in order to advance research on domestic citation analysis and to improve the construction of domestic citation index resources. The Taiwan Humanities Citation Index (THCI), for example, was built to obtain an overall picture of research in the arts and humanities in Taiwan (Chen 2004).

At present, there are four main citation index systems in China. The first is the Chinese Science and Technology Paper and Citation Database (CSTPCD), which was established by the Institute of Scientific and Technical Information of China in 1988. The CSTPCD provides services for retrieving and analyzing citation data from technical journals. The Chinese Science Citation Database (CSCD), built by the Chinese Academy of Sciences, followed in 1989 as a citation index system for the natural sciences. In late 1998, the Chinese Social Sciences Citation Index (CSSCI) project was started by the Chinese Social Sciences Research Evaluation Center. This system mainly serves as a citation index of social sciences journals in China. The fourth is the Chinese Humanities and Social Sciences Citation Database (CHSSCD), which was developed by the Chinese Academy of Social Sciences in 2002.

Although both CSSCI and CHSSCD are designed for assessing the research level of the humanities and social sciences, their functions and focuses differ greatly. CHSSCD, in its CD edition, is used simply for statistical analysis of research, while CSSCI, available in both CD and network editions, serves as a standard of academic performance in Chinese universities. Moreover, the retrieval function of CSSCI is more powerful than that of CHSSCD, though the latter has wider coverage of source journals. CSSCI can provide research analysis reports from different aspects and different databanks, for example, the analysis report on academic norms and discipline activity drawn from the databank of citation literature. CHSSCD, however, can only provide basic analysis results based on statistical data about publication information.

The publication of citation index promotes the development and prosperity of research on citation index analysis, and helps improve the structure and function of citation index in a more effective way. For example, Garfield (1979) used the citation index and citation index analysis to establish links among different works or researchers. This has become one of the most widely used methods in bibliometrics. The concept of bibliographic coupling was proposed by Kessler (1963). He argued that the pairs of documents are likely to be similar in some way if they often refer to the same papers, and this should produce a pattern which could reflect recognizable scientific relationships. Bibliographic coupling can be used to identify innovatory research topics and relations between disciplines (Glänzel and Czerwon 1996). Nowadays, large-scale index data and advanced analysis tools have brought great changes to the citation analysis research, allowing millions of citations to be analyzed simultaneously. More than a few researchers have pointed out the importance of time and space in citation analysis and the necessity of assessing research activities from a national perspective, and some typical alternative citation indexes of non-English journals are described as follows (Archambault et al. 2006; Bordons et al. 2002; Casal et al. 2005).

The Scientific Electronic Library Online was created in 1997 through collaboration between the Latin American and Caribbean Center on Health Science Information and the Sao Paulo State Foundation for supporting science research (Meneghini et al. 2006). Multi-dimensional statistical analysis for measuring the relationships among academic society journals has been carried out on the basis of the citation index database of Japanese papers, which was developed by the National Institute of Informatics in Japan (Negishi et al. 2004; Su 2001). A similar citation index database in Latin America is Iberindex (Casal et al. 2005). In-recs is a bibliometric index that provides statistics based on counts of bibliographical citations in order to determine the relevance, influence and scientific impact of Spanish social sciences journals, of the authors who publish papers in them, and of the institutions to which those authors are affiliated (Julia 2008). The Érudit database is a digital publishing platform containing Québec scholarly journals, and has been used to examine how the addition of local journals improves the measurement of research outputs (Vincent 2011). The Indian Citation Index (ICI), a web-based citation database of India-based or India-focused research journals, has been proposed to disseminate the results of research carried out in India (Rabishankar and Anup 2011).

Chinese domestic citation indexes have been used to evaluate the research level of Chinese academics. Moed (2002) used the publications written by Chinese scholars in Web of Science to calculate the degree of research internationalization of China. He also used the publications included in CSCD to analyze the growth of academic maturity in domestic journals in China. Yi (2004) discussed some construction problems in using CSTPCD to assess research outputs in China, such as the selection criteria for source journals, the approach to building the database architecture and the choice of keywords for each research field.

Functions and features of CSSCI

Compared with the natural sciences, the humanities and social sciences have their own specific features. In these disciplines a large number of articles are reviews and critical articles, and books have more influence than articles do. This contrasts markedly with the natural sciences. Moreover, collaboration in the humanities and social sciences tends to aim at proposing theories or concepts rather than at conducting experiments together, which is common practice in the natural sciences. Therefore, these features need to be considered separately when evaluating research outputs in the humanities and social sciences. An interesting unresolved problem in citation index analysis is how to account for the differing suitability of the method across disciplines.

The functions of CSSCI that we designed focus mainly on using the citation index to reflect these characteristics of the humanities and social sciences. We attempted to provide not only the retrieval functions common to other citation index systems, but also scientific research, scholar assessment and discipline analysis reports based on different data fields from the source and citation literature (Su 2000). Given the difficulty and complexity of the task, our team members devoted great effort to the CSSCI system design and data organization. To date, researchers in our lab have published more than 300 papers, ten thematic research reports and five monographs on citation index analysis, including four large academic works (each of more than 1.5 million Chinese characters). These research outputs demonstrate the usefulness of CSSCI in research assessment and research management in the humanities and social sciences and have attracted considerable attention in academic circles.

The following sections introduce the specific functions of the CSSCI system and show how these characteristics of CSSCI are used in research analysis. We first introduce the data organization of CSSCI, including the data structure of CSSCI and the coding strategies of the citation index, which are the basic elements required to realize the CSSCI functions. We then explain the value of applying the citation index analysis functions of CSSCI to evaluate different aspects of humanities and social sciences disciplines.

Structure of the CSSCI system

According to the original design, the CSSCI system consists of three subsystems: the data processing subsystem, the information retrieval subsystem, and the statistics and analysis subsystem. The major tasks of the data processing subsystem are to input data, clean data, correct errors and merge data so that the data can be used for statistical analysis. The information retrieval subsystem is used to establish a network retrieval platform and to produce the CD edition, both of which provide retrieval services for users. The statistics and analysis subsystem performs various kinds of statistical analysis on the basis of CSSCI data. The three subsystems are closely related via data streams and lay a solid foundation for realizing the different functional requirements of CSSCI. Figure 1 shows in detail the functions of the three subsystems.

Fig. 1 Structure of the CSSCI system

Features of the CSSCI system

As stated above, the purpose of establishing the CSSCI platform is to guide scientific research in the humanities and social sciences, and to support the effective evaluation and management of this research. Several unique design features distinguish the CSSCI system from other citation indexes. It carries more data fields for special retrieval and produces a wide range of analysis reports along more dimensions. It can search the source literature using retrieval items such as author, title, organization, funded project, journal, district and keyword. It can also search cited documents, cited authors, cited organizations and cited journals in the citation literature of the CSSCI databases. The distinctive features of CSSCI are described as follows:

Data quality control mechanism

A series of data quality control measures are adopted in the CSSCI system. For instance, the system uses an automatic fuzzy matching algorithm to select records of authors, institutions, journals and article titles that have a high probability of error. Once errors detected by the system have been corrected manually, the system corrects later occurrences automatically. The system is also equipped with specialized dictionaries, such as a dictionary of source journals and a dictionary of institution categories, which are used for data quality control (Su 2001).
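
The dictionary-based check described above can be illustrated with a minimal sketch. It assumes a small, hypothetical dictionary of source-journal names and uses the `difflib` module from the Python standard library; the fuzzy matching algorithm actually used in CSSCI is not specified in this article, so the function name and the similarity cutoff are illustrative assumptions only.

```python
from difflib import get_close_matches

# Hypothetical dictionary of canonical source-journal names (illustrative only).
JOURNAL_DICTIONARY = [
    "Journal of Library Science in China",
    "Social Sciences in China",
    "Historical Research",
]

def check_journal_name(raw_name, cutoff=0.8):
    """Classify a journal name entered during data input.

    Returns ("ok", name) for an exact dictionary hit,
    ("suspect", suggestion) when a close dictionary entry exists,
    and ("unknown", None) when nothing is similar enough.
    The cutoff of 0.8 is an assumed value, not a CSSCI parameter.
    """
    if raw_name in JOURNAL_DICTIONARY:
        return ("ok", raw_name)
    candidates = get_close_matches(raw_name, JOURNAL_DICTIONARY, n=1, cutoff=cutoff)
    if candidates:
        return ("suspect", candidates[0])
    return ("unknown", None)
```

Records returned as "suspect" would be the ones routed to manual checking in such a design.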

Effective and diversified retrieval approaches

CSSCI provides nearly 20 retrieval items. Logically matched retrieval can be conducted across different data items as well as within one data item. One remarkable feature of the CSSCI system is its exclusive searching function, which helps avoid large numbers of irrelevant results. Take the phrase “中华|人民|共和国” as an example, where the symbol “|” marks the correct semantic segmentation. Some systems may return the phrase “中华人民共和国” for the search term “华人”, owing to an incorrect segmentation of the phrase as “中|华人|民|共和国”. The CSSCI system provides a function called exclusive retrieval to avoid this problem. Likewise, when a user searches for the term “民法”, the CSSCI system automatically avoids returning “人民|法院”.
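
The difference between plain substring search and exclusive retrieval can be sketched as follows. The sketch assumes that the indexed text has already been segmented correctly, with “|” as the delimiter, and that a query should match only a whole segment or a run of adjacent segments; both function names are hypothetical, and the actual CSSCI implementation is not described in this article.

```python
def substring_match(query, segmented_text):
    """Naive search: ignores segmentation and matches any substring."""
    return query in segmented_text.replace("|", "")

def exclusive_match(query, segmented_text):
    """Match only when the query equals one segment or a run of adjacent segments."""
    tokens = [t for t in segmented_text.split("|") if t]
    for start in range(len(tokens)):
        joined = ""
        for token in tokens[start:]:
            joined += token
            if joined == query:
                return True
            if len(joined) >= len(query):
                break  # the run is already as long as the query; try the next start
    return False
```

With the phrase from the text, `substring_match("华人", "中华|人民|共和国")` succeeds, whereas `exclusive_match` rejects the same query because “华人” straddles a segment boundary.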

Building data warehouses for citation analysis

Since different types of data are multi-dimensionally related in the citation index, the CSSCI system constructs data warehouses so as to make the citation analysis thorough. For example, the data warehouse of citation literature collects different types of journal citation indicators, which form a multi-dimensional relationship to generate information of comprehensive indicators of academic journals and construct a citation network among them. Another example is the construction of a data warehouse for keyword analysis. By observing the change in the keywords as well as the increase and decrease in the frequency of related keywords over a certain period, a researcher can use this data warehouse to analyze research hotspots and developing trends. By examining the collocation and co-occurrence of keywords in a certain document, a researcher can find the evidence for interdisciplinary overlapping and the new development in one discipline.
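
The two keyword analyses mentioned above, frequency change over time and co-occurrence within a document, can be sketched in a few lines. The record layout (a year plus a list of keywords per document) and the function names are assumptions made for illustration; they do not describe the actual CSSCI data warehouse.

```python
from collections import Counter, defaultdict
from itertools import combinations

def keyword_trends(records):
    """Count, for each keyword, how many documents use it in each year.

    records: iterable of (year, [keywords]) pairs, one per document.
    """
    trends = defaultdict(Counter)
    for year, keywords in records:
        for keyword in set(keywords):  # count each keyword once per document
            trends[keyword][year] += 1
    return trends

def keyword_cooccurrence(records):
    """Count how often two keywords appear together in the same document."""
    pairs = Counter()
    for _, keywords in records:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs
```

Rising counts in `keyword_trends` point to possible research hotspots, while frequent pairs in `keyword_cooccurrence` that combine keywords from different fields hint at interdisciplinary overlap.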

Status analysis of research features of humanities and social sciences

One of the special functions of CSSCI is to analyze the research features of humanities and social sciences from the perspective of citation index. Specifically speaking, according to the types of citations, the system can reflect the growth and maturity of a discipline. For example, the discipline which has a high proportion of citing books is a mature discipline, and the discipline which has a high proportion of citing papers is a fast developing discipline. In addition, according to the amount of citations, the system can investigate the research habits and degrees in academic standardization of one scholar. Moreover, according to the types of articles, the system can demonstrate whether scholars in a certain discipline attach importance to academic retrospection and reflection, and whether the academic critique is active.

Other research features of the humanities and social sciences, such as highly cited review articles being landmarks of these disciplines, can also be shown on the basis of the relevant citation indexes in the CSSCI database. More details can be found in the following section on research analysis based on CSSCI.

Data organization and standardization of CSSCI

Structure of the CSSCI database

In order to improve the convenience and efficiency in data processing, statistical analysis and information retrieval, the CSSCI database is divided into nine databanks. The three core databanks are respectively the databank of source literature, the databank of source authors and the databank of citation literature. Other databanks are mainly used for quality control in the phase of data input or for checking data during statistical analysis. The whole data structure of CSSCI is demonstrated in Fig. 2.

Fig. 2 The framework of CSSCI data organization

The following paragraphs explain the function and organization of each databank in Fig. 2.

The databank of source authors. This databank records information about the authors of the documents in the databank of source literature. It exists to handle documents with more than one author conveniently: keeping authors in a separate databank both saves storage space and simplifies later data processing. The main fields in this databank are the document identification number with the author’s sequence number, name, institution, location of publication, code of institution category, etc.

The databank of source literature. This databank stores all the source literature. The main fields are document identification (unique), title, journal name, language, article type, year of publication, volume and issue (number), page number, first number and second number in Chinese library classification (Editorial board of Chinese library classification 2010), keywords, name and code of funded project, etc.

The databank of citation literature. All the literature cited by the source literature is recorded in this databank. The main fields are document identification with citation sequence number, author name, title of literature document, name of journal or publisher, language of literature, category of literature, type of citation, other description fields in literature, year of publication, volume and issue (number), starting and ending page numbers, etc.

The dictionary of source journal. This dictionary records all the journals included in CSSCI. The main fields are journal name, journal code, publication frequency, sponsor and correspondence address, etc.

The dictionary of institution category. This dictionary records the names of the authors’ institutions and the code of institution category, and is used to generate classifications and statistics for an institution based on the number of publications and citations. The main fields include institution name, institution category code, etc.

The dictionary of district code. This dictionary records the district code of the place where an author is located. It is used for statistics and analysis of a location, based on the number of publications and citations in the location, and also for information retrieval by location. The key fields in this dictionary include the name of district and the code of district, etc.

The library of journal evolution. This library stores information about the name evolution of journals, which is used as a clue when merging the statistical results from searches of the number of citations. The main fields are journal name, the beginning year of publication, the beginning year of the current name, the ending year of the former name and the name used before the current journal name, etc.

The library of institution change. This library records information about the name evolution of institutions. The trend of universities merging and renaming over the past 20 years in China poses a great challenge to the correct calculation of the number of paper publications and citations of an institution. This library is built for merging the numbers of paper publications and citations of the same institution that have been recorded under different names in different periods. The main fields are institution name, change type (merging or renaming), time of merging or renaming, old name (in case of renaming), and the former name (in case of merging), etc.

The library of public dictionaries. This library records the codes of different types used in data input, retrieval and analysis, such as type of article, type of language, type of citation, type of supporting fund and so on. The main fields are type of data (e.g., type of document, type of language), name of content, code of name, etc.
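
To make the relationships among the databanks concrete, the following minimal sketch models the three core databanks as Python dataclasses linked by the document identification number, and shows how a library of institution change could map old names to the current name. All class names, field names and function names are hypothetical simplifications chosen for illustration; the real CSSCI schema contains many more fields and is not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class SourceDocument:          # one row in the databank of source literature
    doc_id: str
    title: str
    journal: str
    year: int
    article_type: str

@dataclass
class SourceAuthor:            # one row in the databank of source authors
    doc_id: str                # links back to SourceDocument.doc_id
    seq: int                   # author's sequence number within the document
    name: str
    institution: str

@dataclass
class Citation:                # one row in the databank of citation literature
    doc_id: str                # the citing source document
    seq: int                   # citation sequence number within the document
    cited_author: str
    cited_title: str
    cited_type: str

def authors_of(doc_id, authors):
    """Return the author names of one document in byline order."""
    rows = sorted((a for a in authors if a.doc_id == doc_id), key=lambda a: a.seq)
    return [a.name for a in rows]

def current_institution(name, changes):
    """Follow renaming/merging records (old name -> new name) to the latest name."""
    seen = set()
    while name in changes and name not in seen:
        seen.add(name)         # guard against accidental cycles in the records
        name = changes[name]
    return name
```

Storing authors in their own table means a document with several authors needs no repeated document fields, and resolving institution names before counting lets publications recorded under an old name be credited to the current institution.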

Coding design of CSSCI

We have done a lot of exploration and practice on encoding data in the CSSCI system in order to make the system more efficient. We have successfully achieved a deeper and more exact statistical analysis of the status quo and features of researches in humanities and social sciences via the encoding data in the CSSCI system. We have made several attempts in the following aspects of coding design.

The discipline classification of articles

Each article in the CSSCI database will be classified according to three classification systems, including Chinese library classification (Editorial board of Chinese library classification 2010), the People’s Republic of China National Standard (GB/T 13745-2009) classification and code of disciplines (Standardization Administration of China 2009), and classification standard (Degree Office of the State Council 2011). The aim of adopting these three different classification systems is to examine the development status of Chinese humanities and social sciences from different angles. The Chinese library classification catalogues each book and paper in a sophisticated way, and this classification system is the foundation of the other two classification systems. The classification and code of disciplines system made by Chinese Standard Publishing House is often used to assess the development of each discipline of humanities and social sciences. The classification standard by the Degree office of the state council is published by the Ministry of Education of China and used for issuing the certificate of graduation diploma. Therefore, the citation index analysis from these three different category systems will reflect the characteristics of literature from different aspects.

Encoding of article type and cited literature

CSSCI uses numeric codes to encode different types of articles. For instance, the system encodes research papers as code ‘10’, overview articles as code ‘2x’, and critical articles as code ‘3x’. The system then assigns sub-categorical codes to the overview articles. For example, the conference overview is encoded as ‘21’ and the academic overview as ‘22’. The academic critical articles are further divided into academic comments (code ‘31’) and book reviews (code ‘32’), etc. The encoding of cited literature types mainly includes the code of the language of the cited literature (Chinese is encoded as ‘01’ and English as ‘02’…) and the code of the cited literature type (the journal article is encoded as ‘01’, and the book as ‘02’…), etc.
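
A short sketch shows how such two-digit article type codes can be decoded: the first digit gives the main type and the full code gives the subtype. Only the codes listed in the text are included; the function names and the handling of unlisted codes are assumptions for illustration.

```python
# Main article types, keyed by the first digit of the code (from the text).
ARTICLE_GROUPS = {
    "1": "research paper",
    "2": "overview article",
    "3": "critical article",
}

# Subtypes named in the text; other codes exist but are not listed there.
ARTICLE_SUBTYPES = {
    "21": "conference overview",
    "22": "academic overview",
    "31": "academic comment",
    "32": "book review",
}

def describe_article_type(code):
    """Return (main type, subtype or None) for a two-digit article type code."""
    if len(code) != 2 or not code.isdigit():
        raise ValueError(f"not a two-digit article type code: {code!r}")
    group = ARTICLE_GROUPS.get(code[0])
    if group is None:
        raise ValueError(f"unknown article group in code: {code!r}")
    return group, ARTICLE_SUBTYPES.get(code)

def is_critical_article(code):
    """True for any code of the form '3x'."""
    return describe_article_type(code)[0] == "critical article"
```

Because the main type is carried by the first digit, all critical articles can be counted with a single prefix test, which is the kind of statistic used later to gauge how active a discipline is.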

Encoding of institution category

The CSSCI system divides institutions into eight categories: colleges and universities, research institutions, party and government units, party schools and administrative schools, the People’s Liberation Army (PLA) system, other institutions not mentioned above, institutions located in Taiwan, Hong Kong and Macao, and international institutions. We use the numbers 1–8 to represent these categories. The system then divides each institution category into several subcategories. For instance, the college and university category is divided into colleges (and universities) affiliated with the Ministry of Education of China, colleges affiliated with other ministries, colleges affiliated with provinces, and other colleges. These subcategories are encoded as 11, 12, 13 and 19, respectively. In the third layer, colleges and universities are subdivided further and represented by codes of the form 1x1, 1x2, 1x3, 1x4, etc. For example, comprehensive universities are represented by the code 1x1, professional colleges by 1x2, normal universities by 1x3, and teacher training colleges by 1x4. After encoding each category, we can conduct a comparative analysis of research achievements according to the categories of college and university.
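
The layered code for colleges and universities can be decoded as in the sketch below. It rests on an assumption that is not stated explicitly in the text: that the middle position of a third-layer code holds the second-layer (affiliation) digit, so that, for example, a code such as 131 would denote a province-affiliated comprehensive university. Only the college and university branch (first digit 1) is handled, and the function name is hypothetical.

```python
# Second layer: affiliation of a college or university (codes 11, 12, 13, 19).
AFFILIATION = {
    "1": "Ministry of Education",
    "2": "other ministries",
    "3": "province",
    "9": "other",
}

# Third layer: kind of college or university (last digit of the code).
KIND = {
    "1": "comprehensive university",
    "2": "professional college",
    "3": "normal university",
    "4": "teacher training college",
}

def decode_college_code(code):
    """Decode a 2- or 3-digit college/university code into (affiliation, kind).

    ASSUMPTION: the middle digit of a 3-digit code is the affiliation digit.
    """
    if not (code.isdigit() and len(code) in (2, 3) and code[0] == "1"):
        raise ValueError(f"not a college/university code: {code!r}")
    affiliation = AFFILIATION.get(code[1])
    kind = KIND.get(code[2]) if len(code) == 3 else None
    return affiliation, kind
```

Grouping institutions by the decoded affiliation or kind then gives the comparative analysis by category of college and university that the text describes.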

Encoding of region

The region code consists of six digits. The division is based mainly on administrative regions. The first two digits indicate the province, the third and fourth digits represent the prefecture-level city (each provincial capital is encoded as ‘xx01xx’), and the fifth and sixth digits stand for the county. After each region is encoded, we can do statistical analysis with respect to a whole area (e.g., statistics on a province) or to different districts (e.g., statistics on cities in the same province or cities in different provinces). It also provides an effective approach for comparing statistics of similar cities (e.g., comparison of provincial capitals).
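
The six-digit layout lends itself to simple slicing, as the following sketch shows. The function names are hypothetical and the sample codes in the usage note are arbitrary values chosen for illustration, not actual CSSCI region codes.

```python
from collections import Counter

def parse_region(code):
    """Split a six-digit region code into province, city and county parts."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"not a six-digit region code: {code!r}")
    return {"province": code[:2], "city": code[2:4], "county": code[4:]}

def is_provincial_capital(code):
    """Provincial capitals are encoded as 'xx01xx'."""
    return parse_region(code)["city"] == "01"

def count_by_province(codes):
    """Count records (e.g. publications) per province prefix."""
    return Counter(parse_region(code)["province"] for code in codes)
```

For example, `count_by_province(["320102", "320501", "110101"])` counts two records under the prefix "32" and one under "11", and filtering with `is_provincial_capital` restricts a comparison to provincial capitals.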

Encoding of funded project

CSSCI not only stores the details of each funded project (project code, project name, etc.) but also encodes the type of the funded project. National foundation funds, foundation funds of the Ministry of Education, foundation funds of other ministries, province-level or city-level foundation funds, and other funded projects are assigned the sequence numbers 1, 2, 3, 4, 5 and so on. The system then further divides the types of funded projects into social sciences foundation projects, natural sciences foundation projects, national 863 projects, national 973 projects and so on. A further subdivision distinguishes major projects, key projects, ordinary projects and national progress plan projects, and each project type is represented by a number.

Encoding the data in this way makes the statistical analysis of CSSCI more effective, convenient and careful. It also enhances the depth and breadth of statistical analysis. In the following section we discuss in detail some of our preliminary research practices in applying the CSSCI system.

Research analysis based on CSSCI

The citation records in academic papers contain much useful information, which can not only reflect the relationship between documents, but also connect a series of documents into a citation chain to help us trace the evolution of a piece of knowledge, or even the formation of a new discipline. The citation index can help us to analyze the research features of a discipline and explore research hotspots and developing trends. In addition, an interdisciplinary examination of such information can help us mine the intersecting points between different disciplines as well as new research fields within a discipline.

Analysis of research features of a discipline

From the time we began to develop the CSSCI system, we considered how to use citation index analysis to identify the research characteristics of a discipline. We conducted a thorough analysis of the cataloging items of the citation index, especially the data in the citation documents that reflect discipline features. After several pilot tests and discussions with experts in different disciplines, we concluded that it is feasible to assess the research features of a discipline objectively by means of citation index analysis.

Analysis of academic norm and vitality of a discipline

By calculating the quantity of citation documents in each discipline and comparing the numbers across disciplines, we can find the differences between disciplines. Such differences reflect the level of academic norms, research habits, research depth, research attitude, research style and academic morality of each research group. Take history as an example. The study of history aims to seek truth from facts, so research in this discipline requires many references and the analysis of diverse viewpoints. The quantity of citations reflects this requirement: the average number of citations per article is approximately 20. By comparison, the average number of citations per article is only 4–5 in journalism and communication. This difference is determined not only by the research objects but also by the difference in academic rigour between the two disciplines. Full details of the citation differences in each discipline are available in the references provided by the present authors (Su 2007a, 2011). We argue that all academic research is based on previous studies. Without citing or referring to others’ research achievements, one cannot make progress in one’s own academic studies. In general, the greater the depth a piece of research reaches, the more reference documents it is likely to cite, and the higher the overall academic norms of a discipline, the more reference documents its work will cite.
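
The indicator used in this comparison, the average number of citations per article in a discipline, can be computed as in the sketch below. The input layout (one pair of discipline and citation count per article) and the function name are assumptions for illustration.

```python
from collections import defaultdict

def average_citations_per_article(articles):
    """Return {discipline: mean number of citations per article}.

    articles: iterable of (discipline, number_of_citations) pairs,
    one pair per source article.
    """
    totals = defaultdict(lambda: [0, 0])   # discipline -> [citation sum, article count]
    for discipline, n_citations in articles:
        totals[discipline][0] += n_citations
        totals[discipline][1] += 1
    return {d: total / count for d, (total, count) in totals.items()}
```

Applied to the full databank of citation literature, year by year, the same calculation yields the kind of time series from which changes in citation practice can be read.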

Although many factors may affect the number of citations, the rigour of a discipline can be reflected by the citation amount. In China, academic journals usually do not place a definite limit on the number of citations in an article. Thus, the citation amount can describe how different kinds of literature (e.g., overview and original articles) are cited. The present authors once carried out an in-depth analysis of the citation amount in different disciplines and compared the discipline rigour reflected by the change in the citation amount among different disciplines. For instance, in Su’s academic work (Su 2011), the statistical results for each discipline provide firm support for the author’s conclusion.

According to our comparison of statistics of annual document citations, the average number of citations per article was only 4.6 in 2000 (Su 2007b), but it rose to 13 in 2011. This increase indicates that research norms in Chinese humanities and social sciences are developing in a good direction, because scholars in these research fields have begun to pay more attention to earlier research outputs and to form better research habits. More details are included in the author’s academic work, A report on the academic impact of Chinese in the humanities and social sciences (Su 2011). In short, the greater the overall quantity of citation documents in a discipline, the better its academic style and the more rigorous its research work are likely to be.

Furthermore, the statistical data about different types of academic documents in a discipline can reflect some of its research features. For example, a large number of critical papers in a discipline shows that academic criticism receives good attention and that the discipline is academically active. Review papers can help scholars in a discipline to conduct retrospection and reflection. These two types of papers play an important role in ensuring the sound development of a discipline in the humanities and social sciences. According to our statistical analysis, the number of critical papers is very limited in many of the disciplines of humanities and social sciences in China (Su 2011).

Unlike the natural sciences, the humanities and social sciences are characterized by reviews and overviews. More specifically, research in the natural sciences usually relies on scientific experiments, while research in the social sciences is often based on thought, so it is easy to understand why reviews and overviews are needed more in the humanities and social sciences. Given this social nature, the humanities and social sciences should adopt the policy of “let a hundred schools of thought contend”. The analysis of review articles is therefore of profound significance for assessing how active a discipline is. This article statistically analyzes the distribution of review articles in academic journals in China.

This phenomenon may result from the following two factors:

1) The journal presses are afraid of being involved in controversies or disputes, so they avoid publishing critical papers.

2) The current academic atmosphere discourages scholars from offending others or getting into trouble. Thus there appears a superficial harmony in the academic circles in China.

By comparing the numbers of critical and review papers published in different disciplines, we come to realize the great importance of these two kinds of papers in promoting academic activity and the sound development of a discipline.

Analysis of the international level of discipline research

The international level of discipline research is displayed mainly in two aspects. On the one hand, it is indicated by whether scholars in the discipline publish widely abroad. On the other hand, it is determined by whether academic research in the discipline is in close contact with the corresponding research around the world. We can therefore calculate and analyze the international level of a discipline by analyzing the citation index. The analysis of the languages of cited documents can reflect, to some extent, the connection of a discipline with the corresponding foreign research, especially the introduction of foreign research achievements, ideas and methods. It can also reflect scholars’ general language proficiency and the range of academic resources they can obtain. In general, the more foreign literature the academic papers in a discipline cite, the faster research in that discipline may develop, and the closer its connection with foreign research may be.
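
As an illustration of the language analysis, the sketch below computes, for each discipline, the share of citations whose language code is not Chinese. It relies on the language codes given earlier (‘01’ for Chinese, ‘02’ for English); the input layout and the function name are assumptions, and translated works, which the text treats separately, are not distinguished here because their coding is not described.

```python
from collections import defaultdict

CHINESE = "01"  # language code for Chinese given in the text; '02' is English

def foreign_citation_share(citations):
    """Return {discipline: fraction of citations in a language other than Chinese}.

    citations: iterable of (discipline, language_code) pairs, one per citation.
    """
    counts = defaultdict(lambda: {"foreign": 0, "total": 0})
    for discipline, language_code in citations:
        counts[discipline]["total"] += 1
        if language_code != CHINESE:
            counts[discipline]["foreign"] += 1
    return {d: c["foreign"] / c["total"] for d, c in counts.items()}
```

A higher share suggests closer contact with research published abroad, subject to the caveats about natively focused disciplines discussed in the text.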

We compared the quantity of cited documents in each discipline, excluding disciplines whose research objects are mainly domestic (e.g., Chinese literature, history, archaeology, ethnology, linguistics) and those whose research objects are exclusively foreign (e.g., foreign literature and foreign linguistics). Disciplines such as Marxism, philosophy and politics have a high citation rate of translated works, which indicates that some famous foreign works translated in earlier periods have had great influence on the development of these disciplines in China. Disciplines such as psychology, management and economics have a high citation rate of foreign literature, which indicates that these disciplines draw on many foreign achievements, or that research in these disciplines is closely connected to the corresponding studies abroad. It is noteworthy, however, that disciplines such as journalism and library and information science display a relatively low citation rate of foreign documents, even though the corresponding studies abroad do not show much geographical difference. This phenomenon deserves serious attention for the future development of these disciplines.

Analysis of the academic maturity and growth of a discipline

In the humanities and social sciences, results published in books are generally more mature than those published in academic papers. The short publishing cycle of papers, however, allows authors to report their latest research results promptly. We can therefore gauge the maturity and novelty of a discipline by surveying the types of documents it cites. Generally speaking, new and fast-growing disciplines cite more journal papers and reports than books, while the proportions are reversed in old and mature disciplines. For example, the natural sciences generally develop rapidly, and more than 70 % of the cited documents in this area are journal papers. In contrast, 55 % of the cited documents in the humanities and social sciences are academic books (Su 2007a, 2011).

Based on the comparison and analysis, it can be concluded that the humanities are more likely to cite books (generally over 70 % of citations are to books), while the social sciences are more likely to cite research papers (generally over 60 % are to research papers). This conclusion (Su 2007b, 2011) shows that the humanities are more mature as disciplines than the social sciences, but that they develop more slowly and their research is less active. We also found one discipline in which network information makes up a very high proportion of citations, up to 20 %. Although this may reflect heavy use of the latest information in that discipline, the rigor of its research is open to doubt. The case deserves caution because the reliability of network information is questionable, and citing such information without tracing the original text encourages lazy habits.
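The proportions discussed above are simple shares of each document type among all cited documents in a discipline. The following is a minimal sketch of that arithmetic, not the CSSCI implementation: the function name `type_shares`, the type labels and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def type_shares(cited_types):
    """Return the share of each document type among all cited documents.

    cited_types: list of type labels, one per cited document (hypothetical labels).
    """
    counts = Counter(cited_types)
    total = sum(counts.values())
    return {doc_type: n / total for doc_type, n in counts.items()}

# Invented sample: 10 cited documents from an imaginary humanities discipline.
sample = ["book"] * 7 + ["journal paper"] * 2 + ["network resource"]
shares = type_shares(sample)
for doc_type, share in shares.items():
    print(f"{doc_type}: {share:.0%}")
```

Under the reading given in the text, a discipline whose book share exceeds its journal-paper share would be regarded as more mature and slower-moving, and an unusually large network-resource share would prompt a closer look at citation practice.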

Important research papers and academic works

In the past, the list of the papers or works that had great academic influence was often recommended by experts in a discipline. The personal bias and scope of knowledge of an expert usually limit the comprehensiveness and objectivity of the recommended list. The citation index, which draws data from the citations of academic authors, can be used as a supplement to the current recommendation convention.

Analysis of important research papers

Citation counts show which papers have attracted the attention of peer researchers and which papers play an important role in a field. Five years after the CSSCI database was established, we began this kind of research. We found that some classic papers are still cited by a large number of scholars 20 years, 30 years or even several decades after publication, and that they continue to play an important role in the related research field. Take the paper “Theory of The Firm: Managerial Behavior, Agency Costs and Ownership Structure” as an example. This paper was published in the Journal of Financial Economics in 1976 by Michael C. Jensen, a leading scholar in economics, corporate finance and management. It twice took the top position in our citation analysis reports on academic papers in economics (Su 2007a, 2011). Its citation count is still growing, from 10 in 2000 to more than 110 in 2006. Citation analysis also reveals another interesting phenomenon: some influential papers by foreign scholars retain great influence in Chinese academia a decade or several decades after publication, whereas the corresponding influential papers by Chinese scholars gradually lose influence about 5 years after publication.

These results point to a serious problem in Chinese humanities and social sciences research. The current research evaluation system focuses more on the number of research achievements. Chinese academia urgently needs top-quality academic works that can stand the test of time and exert lasting academic influence.

Analysis of important academic books

In the humanities and social sciences, the influence of a book is usually greater than that of a paper, and a scholar’s published papers are sometimes collected to form a book. It is therefore of great significance to examine the academic influence of books in the humanities and social sciences. The objectivity of the citation index helps us identify the important academic books in a field. In 2007, we launched a research project on “the academic impact of Chinese books in the humanities and social sciences” and published a groundbreaking report, A report on the academic impact of Chinese books in the humanities and social sciences (Su 2007a), which runs to 1.58 million Chinese characters. Based on citation statistics and analysis, we recommended 3,410 books that had exerted significant influence on the development of Chinese humanities and social sciences. The work received much positive feedback from academic circles and was regarded as basic, pioneering research on the humanities and social sciences. Research institutions and libraries use the report as an effective reference list for reading and collecting academic books. Moreover, as the first large academic work on book evaluation in the humanities and social sciences, it can draw publishers’ attention to academic book publishing and promote the prosperity and development of these disciplines.

The report reveals the following features of research in Chinese humanities and social sciences:

  1. Works by statesmen (or leaders) play a very important role in the development of the humanities and social sciences in China. This is a distinctive feature that sets China apart from the development of the humanities and social sciences in other countries.

  2. In disciplines such as philosophy, politics, economics and management science, classic foreign works have a deep influence on the corresponding research in China.

  3. Chinese scholars show a serious lack of ability to read foreign works in the original language and rely on translations instead.

  4. Over the last 30 years, few classic works have been published in the humanities and social sciences. In the humanities, most highly cited works were published more than 30 years ago. In the social sciences, the most highly cited works were published at the turn of the 21st century.

Analysis of research hotspots and developing trends in a discipline

The citation index is an effective tool for examining research hotspots and developing trends in a discipline. We have used the CSSCI database to analyze research hotspots and developing trends in Chinese humanities and social sciences, and the analysis has achieved the desired results.

Analysis of research hotspots in a discipline

CSSCI can be used to identify the research hotspots in a discipline as follows. We first count how often each keyword appears across the documents, and then treat the research fields associated with the high-frequency keywords as the research hotspots. To avoid chance results from using only 1 year of data, we usually count keywords over a period of 3–5 years. In addition, we analyze the disciplinary attributes of the keywords, and we examine the disciplines that the keywords are classified under so as to detect interdisciplinary research fields. Together, these methods allow research hotspots to be determined reliably.
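The keyword-counting step described above can be sketched as follows. This is only an illustration under stated assumptions, not the CSSCI system itself: the record layout (`year`, `keywords`), the function name `hotspot_keywords`, the choice to count a keyword once per document, and the sample records are all hypothetical.

```python
from collections import Counter

def hotspot_keywords(records, start_year, end_year, top_n=10):
    """Rank keywords by the number of documents that use them in a year window.

    records: list of dicts with hypothetical fields "year" and "keywords".
    """
    counts = Counter()
    for rec in records:
        if start_year <= rec["year"] <= end_year:
            # Assumption: a keyword is counted once per document.
            counts.update(set(rec["keywords"]))
    return counts.most_common(top_n)

# Invented sample records for illustration only.
records = [
    {"year": 2001, "keywords": ["digital library", "metadata"]},
    {"year": 2002, "keywords": ["digital library", "ontology"]},
    {"year": 2003, "keywords": ["digital library", "metadata"]},
    {"year": 1995, "keywords": ["library automation"]},
]

# A 3-year window, in line with the 3–5 year period mentioned in the text.
top = hotspot_keywords(records, 2001, 2003, top_n=2)
print(top)
```

Using a multi-year window rather than a single year is what smooths out one-off spikes. Mapping each keyword to a discipline code would be a separate lookup that this sketch leaves out.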

Analysis of developing trends in a discipline

In the past, the citation index was rarely used to indicate the developing trends in a discipline. We attempt to use the CSSCI system for this purpose, i.e., to suggest how each discipline will develop in the coming years. Our methodology is as follows:

  1. Build a data warehouse of keywords, including the keywords extracted from CSSCI source documents, the discipline classification of the indexed keywords, the publication date, and the half-life of each research field (calculated from the data).

  2. Establish the multidimensional relations among these data.

The keyword data warehouse and the multidimensional relations form the data foundation for analyzing the developing trends in a discipline.

The analysis algorithm works as follows. First, we rank keywords by the year-to-year change in their frequency and extract those whose frequency is gradually increasing. Second, we record the year in which each keyword first appeared in the documents and compare the time elapsed from that year to the present with the half-life of the research field to which the keyword belongs. If the elapsed time is less than the half-life and the keyword has appeared at least a certain number of times in recent years, we regard the research field containing the keyword as an important hotspot. If the elapsed time is less than the half-life but the number of appearances still falls short of that threshold, we regard the field as a potential hotspot. If the elapsed time equals the half-life, the field is at its peak and may leave the hotspot zone after 1 or 2 years. If the elapsed time is greater than the half-life, the field is no longer a hotspot. This algorithm has proved effective. One of our previous studies serves as an example: in 2004, we tried to identify the developing trends in library and information science. The results indicated that ‘library automation’ would no longer be a hotspot, that ‘digital library’ would remain a hotspot for the next 10 years, and that ‘ontology’, ‘semantic web’ and ‘webmetrics’ would become potential hotspots. These predictions are largely consistent with the trends observed since 2004.
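The decision rule in this paragraph can be sketched in a few lines. The sketch below makes several assumptions that the text does not specify: the half-life is expressed in whole years, “gradually increases” is read as a non-decreasing yearly series that ends higher than it starts, and “recent years” means the last `recent_years` years including the current one. The function names, parameters and sample numbers are hypothetical.

```python
def is_rising(yearly_counts):
    """Assumed reading of 'gradually increases': never falls, and ends above its start."""
    values = [yearly_counts[y] for y in sorted(yearly_counts)]
    steady = all(a <= b for a, b in zip(values, values[1:]))
    return len(values) >= 2 and steady and values[-1] > values[0]

def classify_keyword(yearly_counts, half_life, current_year,
                     min_recent_total, recent_years=3):
    """Classify one keyword's research field by age relative to the field's half-life.

    yearly_counts: dict mapping year -> frequency of the keyword in that year.
    half_life: half-life of the research field, in whole years (assumption).
    min_recent_total: hypothetical threshold on appearances in recent years.
    """
    first_year = min(y for y, n in yearly_counts.items() if n > 0)
    age = current_year - first_year
    recent_total = sum(n for y, n in yearly_counts.items()
                       if current_year - recent_years < y <= current_year)
    if age > half_life:
        return "no longer a hotspot"
    if age == half_life:
        return "at peak"
    if recent_total >= min_recent_total:
        return "hotspot"
    return "potential hotspot"

# Invented yearly frequencies for two imaginary keywords.
established = {2001: 2, 2002: 5, 2003: 9, 2004: 14}
emerging = {2003: 1, 2004: 2}

for name, series in [("established", established), ("emerging", emerging)]:
    if is_rising(series):
        label = classify_keyword(series, half_life=6, current_year=2004,
                                 min_recent_total=20)
        print(name, "->", label)
```

With these invented inputs, the first series is young relative to a half-life of 6 years and clears the threshold, while the second is young but sparse. In practice the threshold and the half-life would come from the keyword data warehouse described earlier rather than from hand-picked constants.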

Establishment of academic network

An academic network is a useful way to discover features or regularities in academic research. Specifically, we can build an academic network to explore the relationships between research topics and use the results to discover the relationships between disciplines. Such networks also have practical value in improving collaboration and connection between scholars, in promoting the development of scientific research, and in recommending academic resources. We argue that the citation index is one of the best tools or platforms for exploring academic networks.

Over the past 10 years, we have used the CSSCI database for a great deal of research on academic networks. For instance, we build a citation network of authors from the citation relationships among authors, and then identify the core scholars and scholar groups in each research field. Similarly, we build a citation network of journals from the citation relationships among journals, and then pinpoint the mutual citation relationships among journals in the same or different disciplines. In doing so, we have found that some journals violate academic norms, for example through mutual citation for financial benefit or circular citation among allied journals. To describe the current state of collaborative research between institutions and authors, we also build networks of author collaboration, institution collaboration and regional collaboration from the co-author information in documents. These collaboration networks make it easy to identify cooperative research groups, cross-institution research groups, and so on.
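A citation network of the kind described above can be represented as weighted directed edges from the citing party to the cited party. The following sketch uses only the Python standard library and is an illustration rather than the authors’ procedure: the function names, the decision to ignore self-citations, the `min_weight` threshold for flagging mutual citation, and the sample pairs are all assumptions.

```python
from collections import Counter

def build_citation_network(citations):
    """Build weighted directed edges from (citing, cited) pairs."""
    edges = Counter()
    for citing, cited in citations:
        if citing != cited:  # Assumption: self-citations are ignored.
            edges[(citing, cited)] += 1
    return edges

def most_cited(edges, top_n=3):
    """Rank nodes by total citations received (weighted in-degree)."""
    received = Counter()
    for (_, cited), weight in edges.items():
        received[cited] += weight
    return received.most_common(top_n)

def mutual_pairs(edges, min_weight=2):
    """Find pairs that cite each other at least min_weight times in both directions."""
    pairs = set()
    for (a, b), weight in edges.items():
        if weight >= min_weight and edges.get((b, a), 0) >= min_weight:
            pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)

# Invented citation pairs among four imaginary authors (or journals).
citations = [
    ("A", "B"), ("A", "B"), ("B", "A"), ("B", "A"),
    ("C", "B"), ("C", "A"), ("D", "B"), ("A", "A"),
]
edges = build_citation_network(citations)
print(most_cited(edges, top_n=1))
print(mutual_pairs(edges))
```

Nodes with a high weighted in-degree are candidates for core scholars or core journals. Heavy two-way citation between a pair is only a signal that merits inspection, since legitimate close collaboration can produce the same pattern.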

Conclusion

When it first appeared, the citation index was used to establish the relationships between documents and to trace the source and evolution of science. Later, it was gradually adopted for evaluating scientific research. In China, the citation index is now regarded mainly as an evaluation tool. It is therefore necessary to clarify its primary functions. First, it can be used to explore the regularities of science and promote the development of scientific research. Second, it can be used to collect citation data for assessing academic research achievements. Third, it can help us analyze the research features of each discipline and guide planning and management. These three functions were our original aims in designing the CSSCI system.

We have carried out a great deal of research on the basis of CSSCI. The aim of this paper is to draw more attention to the academic value of the citation index. We hope that CSSCI will be put to good use in exploring the features of academic research, thereby promoting the prosperity and sound development of Chinese humanities and social sciences.