1 Introduction

Applying the principles of Mobile Ad hoc Networks (MANETs) [1] to vehicles gives rise to a new domain, Vehicular Ad hoc Networks, commonly known as VANETs. A VANET is a network formed by a group of vehicles that continuously interact with each other [2, 3]. Improving road safety and security is an important goal of such networks. A VANET design aims to permit interaction between nearby vehicles and roadside units, which leads to three possible structures:

  • Vehicle-to-Vehicle (V2V) network: It permits direct interaction between two vehicles without depending on nearby infrastructure.

  • Vehicle-to-Infrastructure (V2I) network: It permits a vehicle to interact directly with a roadside unit; this type of interaction is essential for collecting data and information.

  • Hybrid structure: This structure combines the V2V and V2I structures. Here, communication between a vehicle and the roadside unit occurs in a single-hop or multi-hop pattern, depending on the distance.

Although a VANET is a special instance of a Mobile Ad hoc Network (MANET), it has some specific characteristics, including low transfer speed and small transmission range. Despite the many similarities between the two classes, VANETs differ from MANETs in several aspects, as summarized in Table 1 [4,5,6,7]. The main features that distinguish VANETs from MANETs are highlighted below.

Table 1 Comparison and contrast between the VANETs and MANETs [8]
  1. High dynamic topology: A vehicular network is highly dynamic because of wireless propagation and the velocity of vehicles. Vehicles have high relative speeds, nearly 50 km/h in urban environments and nearly 100 km/h on highways, and they may travel in different directions. Vehicles can therefore enter or leave an ad hoc network several times in a short period, so the topology of the communication network changes quickly and frequently.

  2. Frequent disconnection: Because of this highly dynamic topology, connectivity changes frequently, and the link between two vehicles can suddenly disappear even during data transmission.

  3. Geographical communication: Communication in a VANET depends on the geographical area in which the vehicles are traveling. This differs from other networks, in which a target vehicle or a group of target vehicles is specified by an ID or a group ID.

  4. Predictable mobility: Although VANETs have a highly dynamic topology, vehicles usually follow a mobility model constrained by paths, lanes, roadways, speed, traffic restrictions, traffic lights, and the driver’s driving habits. It is therefore plausible to predict the future state of a vehicle from its mobility pattern.

  5. Propagation model: VANETs operate in three scenarios: city, rural, and highway. In a city, communication is complicated by the surroundings: houses, trees, and other objects act as barriers to signal propagation, and vehicle density keeps changing. In rural areas, it is important to consider signal attenuation and reflection caused by irregular topographic forms such as mountains, slopes, and thick forests. On highways, the propagation environment is typically regarded as an open, unobstructed area, but the signal can still be affected by reflections from roadside hoardings.

1.1 Related surveys

This subsection briefly reviews the literature collected from quality sources. It covers survey papers on several topics in the field of VANETs [9, 10]. The prime focus is to understand congestion in VANETs and how it is controlled, which the past literature helps to clarify. Many surveys have been conducted in the field of Vehicular Ad hoc Networks on issues such as security, routing protocols, congestion, and road accidents.

Hayder et al. [11] (2019) introduced an effective centralized strategy based on simulated annealing that obtains the best path for a vehicle by applying a VIKOR-type cost function. To determine the best route, the strategy uses traffic speed, vehicle density, and road width. Jinglin et al. [12] (2019) proposed a traffic-prediction-enabled, double-rewarded Value Iteration Network (VIN) for greedy path planning. Chang et al. [13] (2018) proposed an efficient real-time traffic data-sharing mechanism with lower computing complexity, based on a distributed transportation system with roadside units (RSUs). Kashif et al. [14] (2018) proposed the Dynamic Congestion Control Scheme (DCCS), which relies on traffic jam recognition and congestion calculation methods. Oliveira et al. [15] (2017) proposed the Adaptive Data Dissemination Protocol (ADDP), which distributes information reliably. Feukeu et al. [16] (2017) proposed the Dynamic Broadcast Storm Mitigation Algorithm (DBSMA) to counter the broadcast storm problem in a vehicular environment; the DBSMA strategy is straightforward to implement. Fallah et al. [17] (2016) examined two algorithms for controlling the range of the network in terms of time durability and spatial equity. Hamida et al. [18] (2015) analyzed various known attacks and highlighted the privacy and safety requirements that VANETs should meet. Engoulou et al. [19] (2014) investigated recently proposed vehicular safety architectures, including standardized routing protocols. Mejri et al. [20] (2014) examined and compared various proposed cryptographic strategies and evaluated their performance and capability.

1.2 Data congestion in Vehicular Ad hoc Networks

Generally, data congestion occurs in VANETs when several vehicles transmit many messages simultaneously [21, 22]. Vehicles continuously broadcast messages to nearby vehicles without knowing whether those messages are needed, which congests the network with a large number of unnecessary messages. In a fully connected VANETs environment, data traffic congestion leads to road congestion because emergency messages are not delivered on time and communication between vehicles is disrupted [23]. Data congestion is therefore considered an important cause of road accidents.

Reason for focusing on Data Congestion in VANETs

According to the World Health Organization (WHO), road traffic deaths have reached around 1.35 million per year. Delays in delivering emergency messages and interruptions in continuous vehicle-to-vehicle and vehicle-to-infrastructure communication contribute to such accidents. If proper communication is established among vehicles, it becomes easier to regulate traffic, reduce traffic jams, and provide up-to-date information to drivers [6, 24]. Moreover, many accidents could be avoided by controlling the data congestion that occurs in the VANETs environment [25]. Thus, it is vital to identify the current status of research on data congestion in VANETs. A research analysis is required to accomplish this aim, as the number of publications has increased substantially since 2009. However, reports from the Scopus database indicate that the impact of these publications has been negligible [26]. A detailed bibliometric analysis is therefore needed to find the research gaps in the research articles that have already been published.

Fig. 1 Number of publications per year for the topic “Data Congestion in VANETs”

1.3 Bibliometric analysis

Bibliometric analysis is an important kind of research that examines the quantitative traits and features of a particular research area [27, 28]. By giving a glimpse of meta-aspects [29,30,31] of the literature, this article explores the emerging trends and improvements [32, 33] in one particular research area, data congestion in VANETs. Investigations are conducted to analyze the strength of this research area. It is thus possible to examine patterns according to the frequency and type of citations [34, 35]. Moreover, bibliometric analysis assesses literature based on quality, knowledge sharing, affiliation, and the financial influence of the research [29, 32, 33, 36, 37]. Bibliometric examinations also track the rising or declining influence of research organizations [34].

While examining the literature, we found only a limited amount of bibliometric work on VANETs. In [38], the productivity of authors and contributing countries was examined using more than 550 publications on VANETs (1992–2011). The authors in [39] investigated the key terms associated with the subject, covering a wide range of viewpoints on publication patterns, research influence, and effectiveness. In [7, 40], the authors examined the growth of the VANETs domain and its improvements over the last several years. Several research articles also analyzed the growth of the data congestion topic in VANETs; however, they do not capture the latest research trends, as publications in the data congestion domain are limited. Bibliometric analysis plays a vital role in estimating the productivity of algorithms or analyzing key terms.

To present new insights, our goal is to illustrate the current status of the data congestion topic over a consistent time span [41]. The number of annual research publications related to data congestion in VANETs has grown steadily since 2010, as illustrated in Fig. 1. Google Scholar search results have almost doubled every year since 2009. A similar rise is reflected in the Scopus (scientific literature portal) database. In 2010, Scopus recorded 689 publications related to VANETs. As of December 31, 2019, Scopus records 11,109 VANETs-related publications and 434 research publications on data congestion in VANETs. Data congestion in VANETs is yet to be extensively explored, suggesting that more research should be carried out for better and more realistic results. To the best of our knowledge, bibliometric analysis of data congestion in VANETs is an under-explored area.

In this paper, we examine a total of 11,109 publications associated with the field of VANETs, including publications in the domain of data congestion in VANETs. The publications were extracted from the Scopus database for the period 2010 to 2019. We outline and empirically investigate the peer-reviewed research publications.

The goals of this article are to analyze the following:

  i. Patterns of publishing, such as co-authors and affiliations,

  ii. Popular keywords,

  iii. Keyword combinations that identify areas of interest.

In addition, a few more items of knowledge are summarized, including

  i. Patterns of citation for journals,

  ii. Outlets for the publications, and

  iii. Affiliations.

Moreover, this paper displays information from the perspective of development, status, and modern trends [42,43,44,45,46].

The remainder of this article is organized as follows. Section 2 briefly describes the research methodology used for the bibliometric analysis. Section 3 identifies the current trend of VANETs research based on keyword groups. In Sect. 4, the research patterns are examined to understand the overall composition from different aspects. Subsequently, the influence and efficiency of VANETs research are explored in Sect. 5. Section 6 applies the VOSviewer tool to illustrate the relationships between different research aspects. Section 7 discusses future scope and open challenges. Lastly, Section 8 presents the conclusion.

2 Research methodology

Bibliometric analysis is a well-organized and effective technique for examining publication patterns in an academic literature database [47], and it can be used to investigate any area [48]. Various authors use bibliometric analyses to discover the relationships between their topic and their field. Moreover, this type of analysis offers a comprehensive view of the research area [49]. Bibliometric analysis is conducted on different topics to pinpoint the scope of a specific topic [50, 51]. However, no thorough bibliometric investigation has been published on data congestion in VANETs. For this reason, we carried out a bibliometric analysis of data congestion in the domain of VANETs for the last decade (2010–2019).

2.1 Objective of the study

This study aims to determine the scope of the data congestion topic in the field of VANETs. A bibliometric analysis of the Scopus database was conducted by retrieving research data on data congestion in Vehicular Ad hoc Networks for the last decade. The analysis allows us to gain insight into the topic and to forecast its future scope. It helps researchers visualize the research contributed by different authors, countries, and organizations. It also shows the number of research publications per year, the number of documents published by each author, the trends of the topic within the field, and more.

2.2 Pre-processing of the research data

The initial step consists of data collection followed by pre-processing, also known as data cleaning. Data collection gathers the details of publications from reputed scholarly portals [31].

2.3 Plan regarding the Bibliographic analyses

Elsevier’s Scopus database is used for searching, aggregating, and processing the research articles. This database has been chosen for the selected specific domain due to the following two reasons:

  • Scopus indexes almost twice as many articles as Web of Science (WoS) and many times more than other databases [33].

  • Scopus also provides advanced functionality for exporting structured data that includes abstracts, references, key terms, and more.

We chose the Scopus database because it is a large interdisciplinary database from Elsevier with particular strengths in science and technology. The bibliometric and citation features draw on the whole of the Scopus database; because Scopus has a larger dataset, more articles, journals, and conference papers have metrics. Scopus also claims to be the largest abstract and citation database of research literature and quality web sources. In future work, we plan to analyze the WoS database with relevant tools and techniques.

2.3.1 Trends in the research

To cover the broad set of publications in the VANETs field, a general search query “(TITLE-ABS-KEY(Vehicular Ad hoc networks))” was applied to the title, abstract, and key terms. This query returned 11,109 articles for the investigation period 2010–2019 (as of December 31, 2019). A data-cleansing procedure was then carried out to identify the articles that were actually relevant. Ultimately, 434 publications were retained that were precisely associated with the topic “Data Congestion in VANETs.” The publication data obtained through this process was examined and segmented from various aspects.

Research trends are identified by using the seven steps listed below:

  1. Initially, the articles for the selected topic are reviewed by access type and document type.

  2. Next, the total number of articles published each year is counted.

  3. The investigation then covers the keywords used in the topic.

  4. The number of articles published by each author is counted.

  5. The number of research articles published by each country is counted.

  6. The number of documents published in each source title is counted.

  7. Lastly, the number of publications in each subject area is counted.
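Most of the steps above reduce to frequency counts over fields of the exported records. The following Python sketch shows one way such tallies could be computed. The field names (`year`, `doc_type`, `countries`, `authors`, `source_title`) and the sample records are assumptions made for illustration; they are not the Scopus export schema or data from this study.

```python
from collections import Counter

# Toy records with hypothetical field names; not data from the study.
records = [
    {"year": 2018, "doc_type": "Conference Paper", "countries": ["India"],
     "authors": ["A. Author", "B. Author"], "source_title": "Source X"},
    {"year": 2018, "doc_type": "Article", "countries": ["China", "India"],
     "authors": ["C. Author"], "source_title": "Source Y"},
    {"year": 2019, "doc_type": "Conference Paper", "countries": ["China"],
     "authors": ["A. Author"], "source_title": "Source X"},
]


def count_single(records, field):
    """Count records by a single-valued field such as year or document type."""
    return Counter(rec[field] for rec in records)


def count_multi(records, field):
    """Count records by a list-valued field; each value counts once per record."""
    counts = Counter()
    for rec in records:
        counts.update(set(rec[field]))
    return counts


per_year = count_single(records, "year")
per_doc_type = count_single(records, "doc_type")
per_country = count_multi(records, "countries")
per_author = count_multi(records, "authors")
per_source = count_single(records, "source_title")

print(per_year.most_common())
print(per_country.most_common())
```

List-valued fields are counted once per record so that a publication with two authors from the same country does not inflate that country's total.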

2.3.2 Research contributions

Research scholars work hard in every field to add new information to the existing body of knowledge. Research articles are the prime source of information about the research conducted in a specific domain. Each article has analysis intentions that determine its contribution. For instance, research articles from the physics domain may suggest a scientific illustration showing the effects of light. Likewise, articles in the electrical engineering domain may introduce a new design for electrical wiring connections. The range of possible contributions in research is practically unlimited; however, contributions with unique facts and creative scenarios are rare. In this paper, different kinds of contribution are discussed, including theoretical contributions, country-wise contributions, and contributions by author count.

2.3.3 Productivity in the research

Productivity is defined as an indicator of performance in any production operation. For bibliometric research, productivity can be determined from two indicators: (i) the researcher’s publication count and (ii) the influence of the researcher’s publications.

3 Analysis of the research trend

To analyze the key topics and their features, it is necessary to organize and aggregate the bibliographic data by examining key phrases and groups [52]. Key phrases describe the research content relevant to a particular area and reveal vital ideas, including the features of individual research contributions. The growth of a new research area can be identified quickly through the recurrence and frequency of key phrases for that area in a particular period. Analyzing the appearance of key titles makes it easier to identify areas or aspects that are closely connected.

3.1 Analysis of keywords

While examining the literature, 73,233 distinct key phrases were identified for the query “(TITLE-ABS-KEY(Vehicular ad hoc networks) AND PUBYEAR > 2009 AND PUBYEAR < 2020),” and 3,689 key terms were identified for the query “(TITLE-ABS-KEY(Data Congestion in Vehicular ad hoc networks) AND PUBYEAR > 2009 AND PUBYEAR < 2020).” These results include many key phrases that authors do not choose frequently.
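The two search strings differ only in the topic phrase, so they can be generated from a single template. The Python sketch below illustrates this; the function name `build_scopus_query` is hypothetical, and the function only formats the query text. It does not contact Scopus.

```python
def build_scopus_query(topic: str, first_year: int, last_year: int) -> str:
    """Format a search string for an inclusive range of publication years.

    Hypothetical helper for illustration; it only builds the query text.
    """
    if first_year > last_year:
        raise ValueError("first_year must not exceed last_year")
    # The queries in the text use strict inequalities on PUBYEAR,
    # so the inclusive bounds are widened by one year on each side.
    return (
        f"(TITLE-ABS-KEY({topic}) "
        f"AND PUBYEAR > {first_year - 1} AND PUBYEAR < {last_year + 1})"
    )


vanet_query = build_scopus_query("Vehicular ad hoc networks", 2010, 2019)
congestion_query = build_scopus_query(
    "Data Congestion in Vehicular ad hoc networks", 2010, 2019
)
print(vanet_query)
print(congestion_query)
```

Passing the inclusive years 2010 and 2019 reproduces the bounds `PUBYEAR > 2009` and `PUBYEAR < 2020` used in the text.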

Fig. 2 Frequent keywords used by various authors in the literature

A publication is typically indexed with three or four, and at most six, key phrases. However, for data congestion in VANETs, both the recurrence and the number of key phrases are higher than usual. The main aim is to reduce the disparity of key phrases. Keywords with high occurrence in the topic “Data congestion in VANETs” are displayed in Fig. 2.

The results show that current research concentrates on popular keywords such as Vehicular Ad hoc Networks (9181), Vehicles (3042), VANET (2797), and Vehicular Ad hoc Networks (VANETs) (2143). In Fig. 2, keywords are classified into six categories: Communication, Computing, Congestion, Data congestion, Routing, and Security. For our research topic, “Data Congestion in VANETs,” the keywords most frequently used by authors are Vehicles (3042), followed by VANET (2797) and Vehicle to Vehicle Communication (1752). Figure 2 shows that most authors used standard key terms to refer to their research work. The literature reveals that investigations in these domains are still ongoing [43, 53,54,55,56].
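The counts above list “Vehicular Ad hoc Networks,” “VANET,” and “Vehicular Ad hoc Networks (VANETs)” separately, although they name the same concept. One way to reduce such disparity is to map variants to a canonical form before counting. The Python sketch below shows the idea; the synonym map and the sample keyword lists are assumptions for illustration, not the procedure or data used in this paper.

```python
from collections import Counter

# Hypothetical synonym map: each lowercased variant points to one canonical form.
CANONICAL = {
    "vanet": "vehicular ad hoc networks",
    "vanets": "vehicular ad hoc networks",
    "vehicular ad hoc networks (vanets)": "vehicular ad hoc networks",
    "vehicular ad hoc network": "vehicular ad hoc networks",
}


def normalize_keyword(keyword: str) -> str:
    """Lowercase, collapse whitespace, and map known variants to a canonical form."""
    cleaned = " ".join(keyword.lower().split())
    return CANONICAL.get(cleaned, cleaned)


def keyword_frequencies(keyword_lists):
    """Count how many publications use each normalized keyword."""
    counts = Counter()
    for keywords in keyword_lists:
        # A set ensures that a keyword counts once per publication.
        counts.update({normalize_keyword(k) for k in keywords})
    return counts


# Toy input: one keyword list per publication (invented for illustration).
sample = [
    ["VANET", "Vehicles", "Congestion control"],
    ["Vehicular Ad hoc Networks (VANETs)", "vehicles"],
    ["Vehicular  ad hoc networks", "VANETs", "Routing"],
]
freq = keyword_frequencies(sample)
print(freq.most_common(3))
```

Whether variants should be merged depends on the question being asked: merging gives a clearer picture of topics, while unmerged counts show which exact terms authors prefer.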

4 Authors and affiliations

This section examines the growth of the topic of data congestion in VANETs across academic institutions, contributing fields, authors, citation patterns, and outlets.

4.1 Analyzing based on educational disciplines and participating country

Based on its publication outlet, every article was assigned to one or more subject areas. The classification of each paper follows the information provided by the Scopus database, as summarized in Table 2.

Table 2 Year-wise percentage of publications by subject area

We analyzed the percentage of contributions to the topic of data congestion in VANETs across several disciplines (Table 2): (i) contributions in the computer science and engineering fields have increased steadily over the last few years, (ii) improvements in congestion control algorithms and methodologies used in other fields are at an early stage, and (iii) business-related fields show a peak of popularity together with an absence of expectation. The data show that the main contributions come from the computer science and engineering areas [57]. Research thus flourishes within its own discipline: data congestion in VANETs belongs to the computer science and engineering field, so the largest contribution is made in this subject area.

Fig. 3 Number of documents published by different countries

As indicated in reports and surveys [2, 58,59,60,61,62,63], the VANETs field is expected to see high adoption in the future. The field is also expected to grow, including in data congestion control algorithms. Further, social sciences and business-related subject areas are expected to develop with the particular aim of understanding the assumptions of VANETs from other viewpoints.

Fig. 4 Co-authorship distribution

To gain insight into contribution patterns, this section analyzes research publications by country. As shown in Fig. 3, India leads with 87 documents published over the last ten years, followed by China with 72 documents. The graph gives a broad summary of the highest contributing countries for the research topic “Data Congestion in Vehicular Ad hoc Networks.”

4.2 Impact of the number of authors per research publication

The distribution of the number of authors per research publication over the last decade is displayed in Fig. 4. A significant share of publications have one, two, or three co-authors. Contributions by a single author are fewer (9.28%) than contributions by five authors (11.55%), as depicted in Fig. 4. Research on data congestion in VANETs therefore appears to lean toward collaboration rather than individual contribution. Publications with more authors are expected to benefit from stronger validation than those by individual researchers.
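Percentages such as those quoted from Fig. 4 follow from a tally of author counts per publication. A minimal Python sketch is given below; the function name `coauthorship_shares` and the input values are invented for illustration and do not reproduce the figures of this study.

```python
from collections import Counter


def coauthorship_shares(author_counts):
    """Return {number_of_authors: percentage of publications}, rounded to 2 places."""
    if not author_counts:
        return {}
    tally = Counter(author_counts)
    total = len(author_counts)
    return {n: round(100 * c / total, 2) for n, c in sorted(tally.items())}


# Toy input: number of authors on each publication (invented for illustration).
authors_per_paper = [1, 2, 2, 3, 3, 3, 4, 5]
shares = coauthorship_shares(authors_per_paper)
print(shares)
```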

Table 3 Referencing patterns (reliable number of references required to have maximum citations)
Table 4 Publications count through the document type
Fig. 5 Document type (conference papers act as a main research contributor in both scenarios)

4.3 The referencing average count

Table 3 shows the pattern of publications having at least one citation (\(n = 64\)). The third row gives the average number of references (f) according to the article’s citations, and (n) denotes the number of publications identified; for example, publications with at least 101 citations have 126.20 references on average. This helps to identify genuine and reliable quality research publications over the decade [64]. These outcomes are based on comprehensive research; note that most journals limit the page count of a research article, which affects the total number of references.
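The averaging behind such a table can be sketched in Python as follows. The sketch assumes that each row covers the publications cited at least a given number of times; the function name, the thresholds other than 1 and 101, and the sample pairs are hypothetical and are not taken from Table 3.

```python
from statistics import mean


def references_by_citation_threshold(publications, thresholds):
    """For each threshold t, return (n, mean references) over publications
    cited at least t times. `publications` holds (citations, references) pairs."""
    summary = {}
    for t in thresholds:
        refs = [r for c, r in publications if c >= t]
        summary[t] = (len(refs), round(mean(refs), 2) if refs else None)
    return summary


# Toy (citations, references) pairs; invented, not taken from Table 3.
pubs = [(0, 12), (3, 20), (15, 30), (60, 45), (120, 110), (150, 140)]
# Hypothetical thresholds; only "at least 1" and "at least 101" appear in the text.
summary = references_by_citation_threshold(pubs, [1, 11, 101])
print(summary)
```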

4.4 The outlet for publication

The choice of outlet affects the visibility and influence of an article. This part summarizes the outlets that researchers favor for conveying their ideas and experience to the research community. The information gathered contains metadata about the document type, such as conference paper, article, or book chapter, which must be examined. Table 4 gives detailed data on document types for the topic “Data Congestion in Vehicular Ad hoc Networks.” Moreover, Fig. 5 shows the document types for the two domains: VANETs and Data Congestion in VANETs. Figure 5 depicts the estimated number of documents of each type, collected from the Scopus database.

Table 4 shows that, on average, 63.26% of the research articles are published in conference proceedings. That is, most research contributions are presented and published in conferences (63.26%) rather than in book chapters (1.97%) and books (0.15%). Conference publications appear to be the primary contributor to the research domain [65,66,67] because journals often require an extended period for the review and publication process. As VANETs is a fast-developing research domain, the timely presentation of an idea is often preferred over extended justification; otherwise, the research might become obsolete.

5 Validation for research

Validation for research describes the extent to which an outcome is recognized as a real and genuine picture of underlying strength. This section shows how citation counts influence the research.

5.1 Impact of citations in a research work

To evaluate the influence of research, individual citation counts were examined. Three aspects are estimated:

  1. the citation counts of authors, conferences, and journals,

  2. the influence of citations for a single publication,

  3. the persistence of the publication, measured through the Normalized Citation Impact Index (NCII).

Concerning research productivity and influence, the “Matthew effect” [68,69,70] describes how highly recognized researchers receive greater credit for their research contributions than other scholars who present comparable work. The recognition received by such researchers also benefits the co-authors associated with them. The visibility of an author and of a publication is further affected by institutional patterns; for example, prominent journals look at research institutes or collaborative research efforts [71]. While the counts in the previous sections provide partial information about publishing patterns, including important disciplines, the main interest of this investigation is to evaluate the impact of contributions. The total citation count of a research publication is one criterion for measuring that impact: it reflects the publication’s popularity through the number of times other publications have cited it. According to the Scopus database, citations per publication range from 0 to 227. The influence of contributions from other viewpoints is presented below.

5.1.1 The pattern for citing and citations per outlets

The distribution of impact and citations was analyzed in the previous sections. As stated in the literature, time affects the total number of citations that a research publication receives. “Life span of the publications is defined as the total number of years that publication has been in edition,” according to NCII [33]. NCII is calculated as follows:

$$\text{NCII} = \frac{\sum_i C_i}{L}, \tag{1}$$

where \(\sum_i C_i\) is the total citation count of publication i and L is the longevity of the publication in years. The data in Table 5 reveal interesting patterns in the general citations.
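As a worked illustration of Eq. (1), the Python sketch below divides a publication's total citation count by its longevity in years. The function name `ncii`, the sample numbers, and the convention that the publication year counts as the first year of longevity are assumptions made here for illustration.

```python
def ncii(total_citations: int, longevity_years: int) -> float:
    """Normalized Citation Impact Index: total citations divided by longevity L."""
    if longevity_years <= 0:
        raise ValueError("longevity_years must be positive")
    return total_citations / longevity_years


# Toy example: a paper published in 2015 and evaluated at the end of 2019.
publication_year, evaluation_year = 2015, 2019
# Assumption: the publication year itself counts as the first year of longevity.
longevity = evaluation_year - publication_year + 1
score = ncii(total_citations=40, longevity_years=longevity)
print(score)
```

Normalizing by longevity allows a recent paper with few total citations to be compared with an older paper that has had more time to accumulate them.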

Table 5 Patterns for the general citation
Table 6 Citations percentage for each outlet

Additional citation analyses based on the different publication outlets were conducted. Table 6 shows that the primary sources of citations are conference research articles (33.29%) and journal papers (23.77%). The publication outlets are explained in Sect. 4.4. In particular, journal contributions gain more citations than other research outlets. On the other hand, books and book chapters make minimal research contributions. This observation may conflict with other research domains, in which books and book chapters are often cited [29, 31, 33, 34, 36]. Although contributions from survey articles are limited in this field, as illustrated in Sect. 4.4, they still attracted a large number of citations because of the immense research interest in this domain.

5.1.2 Efficiency for individual research

Research efficiency is used to identify high-performing research institutes and research experts across several domains. The results support the investigation of highly productive research organizations and reveal their global distribution. This section outlines the research productivity of institutes.

A few strategies are used to examine productivity, including (i) Direct count, (ii) Position of the author, (iii) Equal credits, and (iv) Normalized page span [33, 72]. These strategies are explained below.

The direct count strategy allocates one count to every co-author of a research publication. This method undervalues single-author articles and favors multi-author articles. The author-position method, on the other hand, assigns each author a count that depends on the author’s position in the author list, following the strategies recommended in [33, 73]. This approach therefore treats the first author of a research article as the primary contributor.

Fig. 6 Top cited versus contributing affiliations

Figure 6 shows the ranking of research organizations based on their participation and contribution (contributed affiliation (f)). Institutions in the USA, Canada, and India display a stable and robust base for this parameter, and many of them are well-known, reputed institutes. The ranking helps to identify highly qualified researchers, experts, and scientists whose services can be leveraged in specialized domains. This influence is consistent with the Matthew effect, as researchers gain an advantage through a relevant affiliation and its benefits.

Bibliometric analyses are conducted to examine and evaluate the effectiveness and accomplishments of authors [41]. This paper adopts the equal credit method mentioned earlier. In the equal credit approach, each author receives a count equal to the reciprocal of the number of authors. Table 7 lists well-known authors from the USA, India, Malaysia, Canada, and Brazil as the significant contributors in this research domain. On the other hand, the author-position approach and the direct count method both identify Mario Gerla as the most influential author in the data congestion area of VANETs. This author is an effective contributor in this field, with the maximum number of citations, as depicted in Table 7. This shows that high-impact authors can influence their institute’s reputation along with their own. It also encourages other researchers to collaborate with acknowledged and recognized authors in order to benefit from their standing.
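The difference between the direct count and equal credit strategies can be sketched in Python as follows. The sketch interprets equal credit as a share of 1/n for each of the n co-authors, which is an assumption about the method described above; the function names and placeholder author labels are invented for illustration. The author-position strategy is omitted because its weights are defined in the cited references and are not given here.

```python
from collections import defaultdict


def direct_count(publications):
    """Direct count: every co-author receives one full count per publication."""
    credit = defaultdict(float)
    for authors in publications:
        for author in authors:
            credit[author] += 1.0
    return dict(credit)


def equal_credit(publications):
    """Equal credit: each of the n co-authors receives 1/n per publication."""
    credit = defaultdict(float)
    for authors in publications:
        if not authors:
            continue  # skip records without author information
        share = 1.0 / len(authors)
        for author in authors:
            credit[author] += share
    return dict(credit)


# Toy author lists with placeholder labels; invented for illustration.
papers = [["A"], ["A", "B"], ["B", "C", "D", "E"]]
print(direct_count(papers))
print(equal_credit(papers))
```

Under direct count, authors A and B both score 2, although A wrote one paper alone. Under equal credit, the total credit handed out equals the number of publications, so A scores 1.5 and B scores 0.75.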

Table 7 Individual productivity and top cited authors

5.1.3 Citations distribution among various outlets

Articles published in conferences contribute limited and short-lived insights because they are tied to the specific time of the conference. Therefore, we examined the distribution of publications by outlet. Table 8 lists the journals and their publication counts (f), showing a broad range of standardized journal publications over the last decade on the topic of data congestion in VANETs.

Table 8 Journal publication count
Table 9 Citations for the top 30 conferences
Table 10 Citations for the top 40 journals

Observations of the conference citations in Table 9 also reveal that widely cited articles are frequently published in the IEEE Vehicular Technology Conference, owing to its quality. Its proceedings cover technologies, fields, and categories related to Applied Mathematics, Computer Science Applications, and Electrical and Electronic Engineering. The overall rank of the IEEE Vehicular Technology Conference is 17417; its H-index is 105, its Impact Factor is 0.85, and its SCImago Journal Rank (SJR) is 0.209. High-quality research with substantial impact draws new research scholars toward that particular field.

The articles published in conferences and journals receive the most citations, as depicted in Table 6. To examine the impact of citation count on the journals, a ranking has been generated in Table 10. As discussed in the previous section, the direct count method is applied to examine the citation pattern and to distinguish between the influence of research and its productivity. Adding each journal’s Impact Factor (IF) to Table 10 shows that the average citation count per article and the IF are only minimally related. From this, it can be assumed that the influence of research articles published in journals for a particular area is not strongly affected by the journal’s IF. Although the IF is used to indicate the importance of journals, it remains a subject of controversy among researchers. The total citation count and the NCII score are estimated in Table 11 and examined to identify the topmost publications in the data congestion domain of Vehicular Ad hoc Networks. To contrast the results with different citation measures, a division fG is introduced that includes the Google Scholar citation count.

Table 11 Publication with the topmost citations

5.2 Output validation (top cited affiliations and authors)

The individual author’s productivity is collected to obtain an outline of the leading and most influential authors in the mentioned field. Table 7 provides the citation count for each author over the decade, along with the author’s total citation count to date, h-index, and i10-index. It was noted that Mario Gerla from the University of California (USA) is currently the most prominent and influential author.
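For readers unfamiliar with the two indices reported in Table 7, the sketch below computes them from a list of per-paper citation counts using their standard definitions (h-index: the largest h such that h papers have at least h citations each; i10-index: the number of papers with at least 10 citations). The function names and the sample counts are illustrative assumptions, not values from the paper.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for cites in citations if cites >= 10)


# Made-up citation counts for one hypothetical author.
sample = [25, 12, 10, 4, 3, 0]
print(h_index(sample), i10_index(sample))
```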

As shown in Fig. 3, India and China are the top contributing countries in terms of publication count, ahead of the USA, Canada, Taiwan, and many other countries. The impact of individual authors as influenced by their research affiliations is illustrated in Fig. 6, which depicts the top cited and top contributing research organizations and universities.

6 Research insights obtained by the VOSviewer tool

VOSviewer is a popular software tool intended for creating and envisioning bibliometric networks [74, 75]. The mentioned networks may involve records of researchers, journals, publications, citations, keywords, conferences, and many more [76,77,78,79]. These networks can be created based on the co-authorship, co-citation, co-occurrence, citation, or bibliographic coupling relations [80, 81]. VOSviewer allows us to build and analyze the appropriate relationship between different aspects of research.

To carry out the bibliometric analysis in this article for the selected topic “Data congestion in vehicular Ad hoc Networks,” we used VOSviewer to visualize the bibliographic networks, because it is very complicated to give accurate counts for publications, journals, conferences, citations, and other aspects by hand, and it is not easy to understand how such totals were measured and obtained. From this perspective, VOSviewer is a convenient tool for researchers. It is also one of the most commonly used tools for visualizing aspects such as publication counts, the relationships among different authors, the keywords frequently used by authors, and the relationships among different contributing countries.

To generate bibliometric visualization analysis using the software tool VOSviewer, we follow the six steps listed below [75, 79]:

  1. Start by creating a map based on the bibliographic data.

  2. Upload the Scopus file that has been downloaded from the Scopus database.

  3. Allow VOSviewer to read the uploaded Scopus file; this might take a few seconds.

  4. Once VOSviewer has read the Scopus file successfully, click the Next button, which leads to a window for choosing the type of analysis and the counting method.

  5. Select the type of analysis, as explained below, based on the data provided. For our bibliometric data, for example, VOSviewer provides four types of analysis.

  6. Perform each type of analysis on the various levels or units provided, and go back to Step 5 until all the types are analyzed.

Table 12 Type 1 analysis: bibliographic coupling

6.1 Type 1 analysis: Bibliographic coupling

For the bibliographic data provided, the first type of analysis is “Bibliographic coupling analysis,” which provides the option to visualize data with three units of analysis: Documents, Sources, and Countries. Table 12 lists the features observed in the visualization layouts for the Type 1 analysis.
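As a reminder of what bibliographic coupling measures, the sketch below links two documents whenever their reference lists overlap and uses the number of shared references as the link strength. It illustrates the general concept only and does not reproduce VOSviewer's internal normalization or layout; the function name and the document and reference identifiers are hypothetical.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Return {(doc_a, doc_b): number_of_shared_references}.

    `references` maps a document id to the list of references it cites.
    Pairs with no shared reference are omitted, so some documents may
    end up with no link at all.
    """
    links = {}
    for doc_a, doc_b in combinations(sorted(references), 2):
        shared = len(set(references[doc_a]) & set(references[doc_b]))
        if shared:
            links[(doc_a, doc_b)] = shared
    return links


# Hypothetical documents and reference ids.
refs = {
    "doc1": ["r1", "r2", "r3"],
    "doc2": ["r2", "r3", "r4"],
    "doc3": ["r5"],
}
links = bibliographic_coupling(refs)
print(links)
```

In this toy input, `doc3` shares no reference with the others and therefore stays unlinked, which is the same situation as the documents that fall outside the largest connected set in the analysis.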

(i) Document unit of analysis A bibliographic coupling analysis with the “Documents” unit of analysis is performed. As shown in Fig. 7, a total sample of 434 documents on the selected topic is found during the visualization process. Some items (documents) in the network are not directly linked with others; the largest set of connected documents comprises 388 items. This set is grouped into 15 clusters, also known as research sub-teams. The lines running from one document to another depict the connections between the articles. The largest cluster contains 55 items and is shown in red. Zooming in during the visualization process reveals many more connected articles in this cluster, but only a screenshot of the overall clusters is shown here. The second-largest cluster comprises 45 items (green), and the third contains 43 items (blue). The fourth cluster contains 37 items (bright yellow), followed by the fifth cluster with 35 items (purple) in Fig. 7. Ten more clusters contain various numbers of items and are shown in distinct colors.

Fig. 7
figure 7

Bibliographic coupling analysis with the “Documents” unit of analysis

(ii) Source unit of analysis A bibliographic coupling analysis with the “Sources” unit of analysis is performed. Figure 8 depicts the relationship among the journals and conferences in which most of the articles in the selected area, “Data Congestion in VANETs,” were published over the decade. For a clearer analysis, a threshold of at least two documents per source is applied. Only 60 of the 281 sources dealing with data congestion in VANETs meet this threshold. Figure 8 shows that these 60 sources form eight clusters. The largest cluster has 12 items and is shown in red. As in the previous analysis, many other sources are connected to the cluster but are not visible in this screenshot. The second cluster also contains 12 items (green), and the third has 11 items (blue). Clusters four and five have 8 and 7 items, shown in yellow and pink, respectively. Lastly, the sixth and seventh clusters contain four items each, shown in cyan and orange, followed by the eighth cluster with only two items, shown in brown.

Fig. 8
figure 8

Bibliographic coupling analysis with the “Sources” unit of analysis

(iii) Country unit of analysis A bibliographic coupling analysis with the “Countries” unit of analysis is performed. Countries are linked with each other through documents, citations, or authors, as depicted in Fig. 9. The strong connections shown in Fig. 9 indicate that each cluster of countries is closely linked with the other clusters. Here, we set the threshold for the minimum number of documents from each country to five. Only 26 of the 58 countries contributing to the selected topic meet this threshold. Figure 9 shows four clusters made up of these 26 countries and illustrates the relationships between them through the lines drawn among them. The first cluster has 8 countries, shown in red. The second and third clusters both have 7 countries, shown in green and blue, respectively. The last cluster has only 4 countries, shown in yellow. Countries depicted by large circles, such as India, China, the USA, and Canada, contribute significantly to the topic.

Fig. 9
figure 9

Bibliographic coupling analysis with the “Countries” unit of analysis

Table 13 Type 2 analysis: co-occurrence

6.2 Type 2 analysis: co-occurrence

The second type of analysis for the bibliographic data is “Co-occurrence analysis,” which provides the option to visualize data with two units of analysis: All keywords and Index keywords. Table 13 lists the features observed in the visualization layouts for the Type 2 analysis.

(i) All keywords unit of analysis A co-occurrence analysis with the “All keywords” unit of analysis is performed. Figure 10 illustrates the relationship among all the keywords used over the decade in the selected topic. These are popular keywords used by authors in their research documents. The most frequently used keywords are easy to identify from the area highlighted in yellow in Fig. 10. For a clearer analysis, a threshold of at least five occurrences per keyword is applied. Only 215 of the 2,944 keywords used for data congestion in VANETs meet this threshold. Figure 10 also shows that, in the density visualization, these 215 keywords form seven clusters, and the densest area represents the keywords used most frequently in the topic. The first cluster contains 53 items and highlights the densest group of keywords; the second cluster contains 34 items; and five more clusters depict the remaining keywords frequently used by authors in the literature. As in the previous analyses, many other keywords related to the clusters are observed during the density visualization process.
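The thresholding and counting behind a keyword co-occurrence map can be sketched as follows. This is an illustrative approximation under stated assumptions, not VOSviewer's implementation: the function name, the input format (one keyword list per document), and the sample keywords are hypothetical, and the demo uses a threshold of two so that the tiny input still produces a link (the analysis above uses five).

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(docs_keywords, min_occurrences=5):
    """Keep keywords that occur in at least `min_occurrences` documents,
    then count how often each pair of kept keywords shares a document."""
    occurrences = Counter()
    for keywords in docs_keywords:
        occurrences.update(set(keywords))  # count each keyword once per document

    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    links = Counter()
    for keywords in docs_keywords:
        present = sorted(set(keywords) & kept)
        links.update(combinations(present, 2))  # unordered pairs, sorted for stable keys
    return kept, links


# Hypothetical keyword lists for three documents.
docs = [
    ["vanet", "congestion"],
    ["vanet", "congestion", "routing"],
    ["vanet", "beacon"],
]
kept, links = keyword_cooccurrence(docs, min_occurrences=2)
print(sorted(kept), dict(links))
```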

Fig. 10
figure 10

Co-occurrence analysis with the “All keywords” unit of analysis

Fig. 11
figure 11

Co-occurrence analysis with the “Index keywords” unit of analysis

(ii) Index keywords unit of analysis A co-occurrence analysis with the “Index keywords” unit of analysis is performed. Figure 11 illustrates the relationship among the index keywords used over the decade in the selected topic. For a clear visualization, a threshold of at least five occurrences per index keyword is set. Only 193 of the 2,383 index keywords used for the data congestion in VANETs topic meet this threshold. The density visualization shows that these 193 keywords form seven clusters, and the densest area represents the index keywords used most frequently in the topic. The first and second clusters contain 40 and 37 items, respectively, and show the most frequently used index keywords in this domain. Five more clusters depict the remaining frequently used index keywords.

Table 14 Type 3 analysis: citation

6.3 Type 3 analysis: citation

For the bibliographic data provided, the third type of analysis is “Citation analysis,” which is visualized with a single unit of analysis, Countries. Table 14 lists the features observed in the visualization layouts for the Type 3 analysis.

Fig. 12
figure 12

Citation analysis with the “Countries” unit of analysis

(i) Countries unit of analysis A citation analysis with the “Countries” unit of analysis is performed. Figure 12 illustrates the relationship among all the countries contributing citations on this research topic. For a clear view, two thresholds are applied: (i) each country should have at least one article, and (ii) each country should have at least five citations. Only 47 of the 58 countries contributing to data congestion in VANETs meet these thresholds. However, not all 47 items in the network are directly linked to each other, so the largest set of connected items contains only 31 countries. As shown in Fig. 12, these 31 countries form eight clusters. The first and second clusters contain 5 items each, shown in red and green, respectively. Clusters three and four contain 4 items each, shown in blue and yellow, respectively. Four more clusters complete the citation analysis by country. The countries with large circles have more citations over the decade than the countries represented by small circles. Countries such as India, China, and the USA therefore have high citation counts compared to other countries.
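The step from the items that pass the thresholds to the “largest set of connected items” amounts to finding the largest connected component of the network. A minimal sketch using breadth-first search is given below; the function name and the toy country labels are assumptions made for illustration and do not correspond to the actual data in Fig. 12.

```python
from collections import defaultdict, deque


def largest_connected_set(nodes, edges):
    """Return the largest group of nodes reachable from one another."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, best = set(), set()
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy network: five hypothetical countries, two separate groups.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]
print(sorted(largest_connected_set(nodes, edges)))
```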

6.4 Type 4 analysis: co-citation

For the bibliographic data provided, the fourth type of analysis is “Co-citation analysis,” which is visualized with a single unit of analysis, Cited authors. Table 15 lists the features observed in the visualization layouts for the Type 4 analysis.

(i) Cited authors unit of analysis A co-citation analysis with the “Cited authors” unit of analysis is performed. Figure 13 depicts the relationship among all the cited authors. A threshold of at least fifteen citations per author is applied. Only 244 of the 10,973 authors publishing research in the field of data congestion in VANETs meet this threshold. However, not all 244 items in the network are directly linked to each other, so the largest set of connected items contains only 242 authors. Figure 13 shows that these 242 cited authors form six clusters and reveals the connections among them. The first cluster contains 78 items, shown in red, followed by the second cluster with 39 items, shown in green. Clusters three and four both contain 37 items, shown in blue and yellow, respectively. Lastly, cluster five contains 31 items, shown in pink, followed by cluster six with 20 items, shown in cyan.
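Co-citation is the mirror image of bibliographic coupling: instead of linking two citing documents that share references, it links two cited authors that appear together in the same reference list. The sketch below illustrates the idea under two assumptions that are not taken from the paper: each appearance of an author in a document's reference list counts as one citation, and the input format, function name, and author labels are hypothetical. The demo uses a threshold of two rather than fifteen so the small input yields a link.

```python
from collections import Counter
from itertools import combinations


def author_cocitation(cited_authors_per_doc, min_citations=15):
    """Count how often pairs of sufficiently cited authors are cited together."""
    citations = Counter()
    for cited in cited_authors_per_doc:
        citations.update(set(cited))  # assumption: one citation per citing document

    kept = {author for author, n in citations.items() if n >= min_citations}

    pairs = Counter()
    for cited in cited_authors_per_doc:
        together = sorted(set(cited) & kept)
        pairs.update(combinations(together, 2))
    return pairs


# Each inner list holds the authors cited by one hypothetical document.
cited_lists = [["a", "b"], ["a", "b", "c"], ["a"]]
pairs = author_cocitation(cited_lists, min_citations=2)
print(dict(pairs))
```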

Table 15 Type 4 analysis: co-citation
Fig. 13
figure 13

Co-citation analysis with the “Cited authors” unit of analysis

7 Future research projections

It is challenging to gain an impression of recent research trends from the literature reviews and surveys written by various authors. The strategy and technique of this paper can therefore be applied to examine and analyze any research domain. In future work, the relationship between various topics and authors can be examined to recognize the pattern of research trends within the domain. The results presented in this article can be compared with the outcomes of other analysis articles to generate new evaluations. The methodology used for pre-processing the analysis data can be improved to reduce manual involvement. Future work can also establish the relationship between the different types of attacks that may lead to congestion in the network, which would give a better understanding of data congestion in VANETs [82, 83]. Extending this bibliometric analysis can also help detect loopholes, uncertainties, and interruptions in the architecture of the VANETs model. Such analysis is required to enhance data congestion control mechanisms in the VANETs domain. The outcomes of this study can help explain the research trends and other important factors that need to be considered in the VANETs domain [61]. Plenty of other VANETs issues remain for future work, including low fault tolerance, signal reception, connectivity, and high mobility.

Beyond the future directions above, there are a few topics on which bibliometric analysis can be conducted to gain a better understanding of “Congestion in VANETs.” These topics are highlighted in Fig. 14 and fall into six main categories: security, data communication, routing protocols, managing accidents, reliability, and road congestion. Analyzing these topics through bibliometric analysis can also help address many other issues related to the selected topic. Moreover, although inspiring results have been produced in networking, several significant research problems in this area are yet to be considered [7]. To help researchers identify unsolved problems and current research trends in this vital domain, this section briefly discusses the six categories mentioned above.

Fig. 14
figure 14

Research directions derived according to our analysis

7.1 Integration of data congestion in VANETs with road congestion using machine learning concepts

In VANETs, exchanging information over V2V and V2I links plays a crucial role in the entire communication scenario. Data congestion in the network delays the delivery of emergency messages and interrupts the network’s communication. When vehicles moving on the road cannot communicate properly and no appropriate information is broadcast in the network, the direct result is road congestion. Integrating existing clustering algorithms with machine learning concepts can help avoid these problems in the future [84,85,86,87] and enhance the safety of on-road vehicles through proper traffic management.

7.2 Security, privacy, and liability in the VANETs environment

VANETs security is a major issue that needs to be addressed urgently and thoroughly. This challenge requires the highest consideration even before the design and deployment of VANETs scenarios [83, 88]. Vehicular communication scenarios now face various possible threats, such as fake or false messages capable of disrupting the entire traffic, which need to be examined [89, 90]. Secure structures must be formed so that vehicles receiving data packets from nearby vehicles can build trust in the entities broadcasting those packets [90, 91]. The main challenge in VANETs is how to design and develop a safety framework that can maintain the trade-off among authentication, privacy, and liability [92]. Future security frameworks should consider the safety of the information transmitted across the VANETs channel [19].

7.3 Broadcasting the data for communication in VANETs

Many researchers have considered various approaches and techniques for broadcasting data to initiate communication in VANETs [93, 94]. These approaches involve limited and unlimited bandwidth digital assistance and satellite broadcasting assistance, which have previously included real-time traffic statistics services [95, 96]. The broadcasting approaches used in the VANETs communication environment are associated with several broadcast storm challenges [46]. Many authors have proposed new techniques to overcome these challenges, but none has addressed the problem completely. In the future, the issues related to the broadcast storm should be fully addressed. A novel approach is needed that mitigates the broadcast storm problem in both low and high vehicle density situations [97].

7.4 Improvement in the VANETs Ad hoc routing protocols

Much research has been conducted to incorporate MANETs routing protocols in VANETs [98, 99]. Several recent investigations have examined the effectiveness of traditional Ad hoc routing, including MANETs protocols, in VANET situations [100]. Because the assumptions behind these protocols do not always hold in VANETs, the approach named “carry and forward” was introduced in [101] for this domain. In this approach, a data packet is forwarded to a neighboring vehicle close to the target. However, this packet routing approach has faced many challenges in VANETs. The problem can be mitigated if “carry and forward” approaches are combined with various VANETs routing protocols, including opportunistic routing, geographic routing, and trajectory routing [102]. In the future, more investigations and simulations should be carried out with well-chosen parameters, and the current routing protocols should be extended to overcome the challenges of high end-to-end delay and high packet drop rate during communication without significantly increasing network overhead.

7.5 Techniques to control congestion in vehicular communication for managing the accidents

One of the main aims of VANETs is to reduce the road accidents that occur almost every day while enhancing traffic performance [103, 104]. Several technical hurdles in congestion control must be overcome for broadcasting emergency and periodic beacons. While congestion is kept under control, the authenticity and scalability of safety message transmission should be ensured, particularly in congested scenarios. Much research has been conducted to validate and estimate the performance of congestion control techniques [101, 105]. Researchers have used several strategies to examine the efficiency of wireless transmission systems, such as vehicular wireless transmission systems, simulation, or field studies. The performance outcomes reported in the literature for the examined congestion control techniques reveal the need for Quality-of-Service (QoS) in secure VANET applications, for example, high reliability and low latency, which none of the examined techniques has confirmed. Therefore, there is a high demand for the future development and deployment of vehicular communication techniques to manage congestion and reduce road accidents [104, 106].

7.6 Reliability and cross-layer approach in the VANETs domain

Because of the wireless environment of VANETs, the vehicle-to-vehicle communication network suffers continuous network path break-ups, which can lead to the broadcasting of false messages. This problem creates the hurdle of reliability and authenticity in vehicular communication networks [107, 108]. Numerous error recovery methods have been proposed and implemented over the years to deliver data packets reliably in the wireless communications of VANETs. Conventional methods, for example, ARQ (Automatic Repeat reQuest) [109] and FEC (Forward Error Correction) [110], have not yet produced the desired outcomes in the vehicular communication environment. ARQ can only be employed to assure reliability in point-to-point unicast VANETs communication [111]. FEC, on the other hand, works only with an existing waiting queue of data packets. Every vehicle generates data packets periodically, or automatically in an emergency, and transmits them to nearby vehicles. Consequently, communication reliability remains an open research challenge in designing and deploying a secure VANETs environment. Future work should develop powerful and sophisticated recovery techniques for lost packets to achieve reliable and effective vehicular communications.

8 Conclusion

This bibliometric analysis has examined the literature on the quantitative traits and features of the data congestion topic in the VANETs domain to identify the research trend over the last ten years. The article presents a comprehensive illustration of research growth in the area “Data congestion in VANETs” during this period. A search of the Scopus database up to December 31, 2019, identified 11,109 publications linked to the VANETs domain, of which 434 are related to data congestion in VANETs. All the publications examined in this article are extracted from the Scopus database for the period 2010–2019. The analysis examines the trend in the selected area and analyzes the growth of research publications by incorporating several parameters. The recurrence and frequency of key phrases used in the area during the period are reported to analyze its growth. The collected data show that the Computer Science and Engineering fields contribute most to this research area, which indicates that the area is driven by technology itself. India and China are the highest contributing countries in terms of research publications, followed by countries such as the USA, Canada, and Taiwan. Most of the publications have one or two co-authors, and publications by a sole author are minimal. This finding emphasizes the collaborative research conducted in this domain. Conference publications are the prime source of contributions in this domain, as they account for a significant portion compared to the documents published in journals and books. Nevertheless, the principal sources of references are the documents published in conferences and journals.
This article also presents well-known authors from the USA, India, Malaysia, Canada, and Brazil as the significant contributors to this research domain and identifies Mario Gerla, from the USA, as the most prominent and influential author in the data congestion area of VANETs. Moreover, the publication counts during the last decade are reported for the well-known journals in this domain that are preferred by most authors. We also examine citation counts for the popular conferences and well-known journals. Last but not least, the software tool VOSviewer is used to create and visualize bibliometric networks for the selected topic “Data congestion in VANETs.”