1 Introduction

With the growing popularity of mobile devices, hundreds of thousands of mobile applications (apps) emerge every year, and the app market has become one of the fastest-growing segments of mobile technology. Mobile application platforms play a crucial role in app market interactions: they collect software launched by developers and distribute it to end users, providing a new way to develop, update and download software for mobile devices [1]. Apps, the main component of mobile platforms, are products based on mobile technology. App developers design and optimize software to perform specific functions in order to satisfy customers’ needs, enhance market influence and gain profits. Under fierce competition, some apps hold a dominant position whereas others hold a weaker one. App developers therefore adopt different strategies to allocate limited resources and optimize an app’s overall performance.

Researchers have tried to understand the development patterns of apps and to help developers find suitable business strategies. The frequency of app updates has been shown to be positively associated with the capability of the supplier and the development potential of the product [2]. Free app offers, high initial ranks, investment in less popular categories and a high volume of user reviews can help an app develop sustainably [3]. These preliminary findings suggest that development strategies may directly influence the performance of apps.

Development strategy is a crucial factor in an app’s development: it decides whether an app can attract new customers while maintaining its present competitiveness, and thus determines its market share and market growth. A development strategy can be either convergent or divergent [4]. A divergent strategy can be interpreted as innovation, meaning that the developer exploits new functions and creates new needs. Developers taking convergent strategies, by contrast, learn from existing apps and improve their own apps by adding functions that others already offer in order to meet present needs. Both strategies can be useful approaches to enhancing competitiveness.

However, the real world is more complicated. The impact of a development strategy may differ with the market status of the focal app. For example, an app with an extremely high market share may gain more profit by charging fees because it monopolizes the market, while the same approach may not work for an app with a low market share. To measure the market status of apps, we divide them into groups using the Boston Consulting Group Matrix (BCG Matrix), a widely used tool for portfolio optimization and strategic justification [5]. Our study aims to answer the question of which strategy is better at attracting customers for apps in different market status, that is, in different quadrants of the BCG Matrix.

Development strategy can be hard to measure, but with text mining techniques we can convert text-based app descriptions into text vectors and calculate the average vector that represents the general condition of the market. By measuring the distance between each app vector and the average vector, we can tell how divergent an app is. It then becomes feasible to apply empirical models to explore the relationship between development strategies and download numbers. Furthermore, in practice a phenomenon is often associated with multiple factors. Multivariable linear regression analysis is designed to find an optimal combination of independent variables to estimate or predict the dependent variable. Since the model has been widely applied and has performed well in various fields, it is also adopted in this study. We can thereby address our research question of how development strategies interact with market status to influence the performance of an app.

The rest of this manuscript is organized as follows. In Sect. 2, we present the theoretical development of this research, including the text mining technique and the Boston Consulting Group Matrix. In Sect. 3, we present our hypotheses. In Sect. 4, we present our data collection, BCG Matrix distribution, text vectorization and distance calculation process. In Sect. 5, we apply empirical models to examine our hypotheses. In Sect. 6, we discuss this work’s contributions, implications and limitations and in Sect. 7, we conclude the study.

2 Theoretical Development

2.1 Text Mining for Convergent and Divergent Development Strategies

Most information about apps lies in the descriptions and update details provided by their operators. Text mining is the process of extracting non-trivial patterns or knowledge from text documents [6]. Applying it to app analysis lets us uncover correlations and patterns on mobile platforms beyond numerical fields such as downloads or rankings. Much research has explored the texts on mobile platforms. Maalej and Nabil [7] used multiple machine learning algorithms to automatically classify customer reviews into three types: bug reports, feature requests and simple praise. Finkelstein et al. [8] extracted app features from app descriptions and demonstrated the relationships between customer, business and apps’ technical characteristics. Kim et al. [9] adopted keyword vectors in the network analysis of apps.

Nonetheless, little research has studied the convergent and divergent development strategies of apps using text mining techniques. According to Gallouj [10], most innovations in digital products are incremental or recombinative. The former develops new features and functions, whereas the latter takes advantage of pre-existing technical characteristics and services [10]. When developing an app, an operator can either exploit new functions that have not yet appeared in the market or adopt and refine functions that other apps have already developed to meet market needs. Thus we define apps with more recombinative features as apps taking convergent development strategies and those with more incremental features as apps taking divergent development strategies [4].

Such development patterns can be examined by text mining techniques such as text vectorization. Apps with analogous functions tend to have similar descriptions and update information; that is, they tend to use similar words and expressions. By converting each app description into a vector based on the frequency of the words in it and calculating the distance between vectors, we obtain a way to quantify the difference between apps. Similar work has been done in related fields such as bibliometrics and econometrics. Dias [11] studied 20 million scientific papers over three decades using abstract vectors and found that on average the similarity between disciplines has not changed, but certain areas (e.g. computer science) are becoming increasingly central. Similar methods were applied to analyze product descriptions, which showed that product differentiation significantly improved market gains [12].

2.2 Boston Consulting Group Matrix

The BCG Matrix, developed by the Boston Consulting Group in the 1970s, has been used extensively as a portfolio management tool for business. It categorizes business units along two dimensions: market growth, which represents the extent of industry attractiveness, and market share, which stands for competitive advantage [13]. By dividing products into four quadrants based on these two dimensions, researchers can customize specific strategies for them in accordance with their market potential and competitiveness.

Stars operate in high-growth industries and maintain high market share, which means that they are both cash generators and cash users. They are the primary units in which the company should invest its money. Cash Cows have high market share in a slowly growing market. They are the most profitable brands and should be “milked” to provide as much cash as possible, which can later be invested in Stars. Problem Children operate in high-growth industries but have low market share. They require much closer consideration since they consume a large amount of cash and incur losses but have the potential to become Stars. Dogs hold low market share in a slowly growing market. In general, they are not considered worth investing in and should be liquidated. Nevertheless, Hambrick [14] pointed out in his research on the PIMS database that Dogs had an average net positive cash flow on investment of 3.4%. Thus, it is irrational simply to abandon products in the Dog quadrant. Instead, what is needed is creative, positive research and thinking about how Dogs can be managed for maximum long-term performance [14].

In this study, the BCG matrix is adopted to classify mobile apps into the four types. Market share and market growth rate are both measured by downloads. Star apps have high downloads that are still growing rapidly over time, whereas Cash Cow apps have downloads that are generally high but stable. Problem Children’s downloads are low but growing fast. Downloads for Dogs are low and growing slowly. These four types represent four typical market status, and by testing how development strategies influence the performance of an app in each status, we can establish a better and more specific understanding of the mobile app market.

3 Hypothesis Development

According to microeconomic theory, markets are classified into four types by their degree of competition and monopoly: perfect competition, monopolistic competition, oligopoly and perfect monopoly [15]. This division is based on traditional industry, where competition and monopoly are antagonistic. However, none of the four types describes the network economy well, because there competition and monopoly can reinforce and promote each other. Enterprises gain and maintain a monopoly position through competition in technological innovation, which Chinese researchers call Competitive Monopoly [16]. Su et al. [17] characterized the Internet market structure as Hierarchical Monopoly and Competition. On the one hand, large Internet enterprises hold a huge number of users, leading to highly concentrated market shares in certain fields. On the other hand, the great success of such enterprises attracts large numbers of small and medium-sized Internet companies to enter the market or to transform from traditional industries into Internet industries. Nonetheless, such entry hardly changes the high concentration of the market and results in the withdrawal of quite a few operators. Therefore, the high liquidity of small and medium-sized Internet enterprises entering and leaving the market, together with the relative stability of large Internet enterprises’ monopoly status, forms the specific “hierarchical monopoly and competition” market structure.

In the BCG Matrix, Star apps show high market share and high market growth. The former reflects the monopoly status of these apps, as the Internet market structure suggests, while the latter implies that they continue to attract consumers. Fast expansion of monopolistic products indicates that the markets they are in are immature. These products can attract potential customers who have never used them or similar products. Therefore, keeping market growth high for Star apps means continuing to attract potential consumers, which requires developers to design apps in a popular, familiar style. The reason lies in how customers accept a new product, as follows.

Rogers [18] proposed a theoretical framework of the factors that affect the diffusion of innovation. He showed that consumers are less likely to accept a new product when they perceive the innovation as complicated. Davis [19] proposed the Technology Acceptance Model (TAM) to explain the determinants of information technology usage, which holds that two factors, namely perceived usefulness and perceived ease of use, predict attitudes towards usage intentions.

Hence, when users who are not attached to any homogeneous app intend to try a new one, they are more likely to use it if they can quickly learn how to operate it. Convergent development strategies, which recombine functions developed by both homogeneous and heterogeneous apps, help consumers find familiar ways of operating the app. Thus we hypothesize the following:

H1: Apps of Stars with convergent development strategies are more likely to achieve better performance.

Cash Cow apps have high market share and low market growth. These apps enjoy monopoly status like Star apps but are located in a mature market. Since converting potential consumers is hard for them, their major task becomes reducing the loss of customers and winning consumers over from competitors.

Jackson [20] defined switching costs as the psychological and economic costs of changing suppliers, including learning costs, transaction costs and artificial costs imposed by firms, such as repeat-purchase discounts [21]. Before customers purchase a particular product, they have no ties to it, since the transaction costs and learning costs are the same across products. However, once they get used to a particular product, switching costs build up and they become less likely to use another product with similar functions. Therefore, for Cash Cows, which already have a large number of customers, the crucial task is to retain them, and increasing the switching cost is a direct way to do so.

The reason for apps of Cash Cows choosing divergent development strategies is that differentiation enhances customer switching costs. By exploiting new functions that have not appeared in the market, monopolistic apps improve customer loyalty and finally maintain monopoly rents. Thus we develop the following hypothesis:

H2: Apps of Cash Cows with divergent development strategies are more likely to achieve better performance.

For decades the debate was dominated by the antagonism between a negative ‘Schumpeter effect’ and a positive ‘Arrow effect’ of competition on innovation [22]. Hashmi [23] proposed a model to address this argument, known as the inverted-U relationship: if the initial degree of competition is low, rising competition has a positive impact on innovation effort, whereas at high levels of initial rivalry, increasing competition reduces the incentives for innovation. Based on Hashmi’s model, Zheng et al. conducted a study of the relationship between competition and innovation in the e-business market, which showed a positive relationship not only between competition and innovation but also between innovation and company performance [24]. As mentioned above, small and medium-sized Internet companies face fierce competition, so they need substantial innovation to stand out. Apps of Problem Children and Dogs share the feature of low market share, so we state our hypothesis as follows:

H3: Apps in Problem Children and Dogs with divergent development strategies are more likely to achieve better performance.

4 Data Description and Processing

4.1 Sampling and Data Collection

To test our theoretical hypotheses, we examined a dataset collected from five typical Android mobile platforms in China: Baidu mobile assistant, 360 mobile assistant, Eoemarket, Mumayi and Appchina. These mobile platforms provide details for every app, including its introduction, update information, number of downloads, category and ranking. Customers decide whether to download the APK of an app through the platform according to the information provided. Conducting empirical studies with second-hand data is a widely used method in information systems research [25,26,27].

By implementing a web crawler to collect data on 3000 apps every other week from January 1st 2018 to October 1st 2018 from the five platforms, we obtained high-quality data reflecting the basic information and market status of the apps along with their changes over time. We further excluded apps that were withdrawn by the platform during the ten-month collection process or that lacked key attribute fields such as introduction or downloads. Eventually, we narrowed our dataset to 1805 apps qualified for analysis.

4.2 BCG Matrix Distribution

To explore the impact of divergent and convergent development strategies on apps in different market status, we need to classify the apps into the four quadrants of the Boston Consulting Group Matrix according to their growth rate and market share. In this article, we define an annual growth rate over 10% as a high growth rate and a relative market share over 20% as a high market share. Let A be the set of all apps in the dataset, P be the set of all mobile platforms, and \( P_{j} \) be the set of apps listed on platform j. The formulas are as follows.

$$ g_{i} = \frac{{final\;download_{i} - initial\;download_{i} }}{{initial\;download_{i} }} \times 100\% \;(i \in A) $$
(1)
$$ s_{ij} = \frac{{download_{ij} }}{{max\;downloads_{j} }} \times 100\% \;(j \in P,\; i \in P_{j} \cap A) $$
(2)

We used the downloads from January 1st to August 1st to calculate the growth rate for each app. Since this period covers only two-thirds of a year, we marked apps with a growth rate above 6.66% as fast-growing. For market share, download counts differ significantly across platforms, largely because of differences in market size and user numbers. For instance, the mean download number for Eoemarket is 451,533, while that for 360 mobile assistant is 13,561,666. It would be unreasonable to mark an app with 600,000 downloads on Eoemarket as low-share while regarding an app with 1,000,000 downloads on 360 mobile assistant as high-share. We therefore took the maximum downloads on August 1st on each platform as that platform’s max downloads and calculated each app’s market share relative to the platform it was on.
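The classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AppRecord` layout and the function names are hypothetical, not the authors' schema or code, and the thresholds are the 6.66% growth and 20% relative share quoted in the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text: 10% annual growth scaled to the
# observation window (6.66%) and 20% relative market share.
GROWTH_THRESHOLD = 6.66  # percent
SHARE_THRESHOLD = 20.0   # percent


@dataclass
class AppRecord:  # hypothetical record layout, for illustration only
    name: str
    platform: str
    initial_downloads: int  # downloads on January 1st (assumed > 0)
    final_downloads: int    # downloads on August 1st


def growth_rate(app):
    """Eq. (1): percentage change in downloads over the window."""
    return (app.final_downloads - app.initial_downloads) / app.initial_downloads * 100.0


def relative_share(app, max_downloads):
    """Eq. (2): downloads relative to the platform's most-downloaded app."""
    return app.final_downloads / max_downloads[app.platform] * 100.0


def classify(apps):
    """Map each app name to a BCG quadrant label."""
    # Maximum final downloads per platform, used as the denominator of Eq. (2).
    max_downloads = {}
    for app in apps:
        current = max_downloads.get(app.platform, 0)
        max_downloads[app.platform] = max(current, app.final_downloads)

    quadrants = {}
    for app in apps:
        high_growth = growth_rate(app) > GROWTH_THRESHOLD
        high_share = relative_share(app, max_downloads) > SHARE_THRESHOLD
        if high_share:
            quadrants[app.name] = "Star" if high_growth else "Cash Cow"
        else:
            quadrants[app.name] = "Problem Child" if high_growth else "Dog"
    return quadrants
```

Computing the maximum per platform, rather than over the pooled data, is what keeps a mid-sized app on a small platform from being labeled low-share merely because another platform is larger.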

4.3 Measuring Convergence and Divergence of Apps

In the text vectorization process, we regarded the description of an app, along with all its update information between January 1st and August 1st, as the representation of that app. We then used Jieba, a Chinese word segmentation tool for Python, to segment the texts, removed low-information words according to the stop-word list provided by Harbin Institute of Technology, and created a dictionary of all terms in the corpus. Every app was then represented as a vector in which each element is the frequency of the corresponding dictionary word in the app’s text. We normalized each vector to unit length so that the result would not be affected by the length of the descriptions. The vectors can be represented as \( P_{i} \).

Apps belong to different categories (e.g. games, entertainment, music, tools, etc.), in which they have distinct functions and features. Therefore, we calculated the market standardization vector for each category by calculating the average number of each vector component.
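As a rough illustration of the vectorization and category averaging described above, the following sketch works on texts that are already segmented into tokens with stop words removed (the segmentation step itself is not reproduced here). The function names are hypothetical, and the use of the Euclidean norm for "unit length" is an assumption, since the text does not say which norm was used.

```python
import math
from collections import Counter


def build_vocabulary(docs):
    """Sorted list of all distinct tokens in the corpus (the dictionary)."""
    return sorted({token for doc in docs for token in doc})


def tf_vector(tokens, vocabulary):
    """Term-frequency vector scaled to unit Euclidean length (assumed norm)."""
    counts = Counter(tokens)
    raw = [float(counts[term]) for term in vocabulary]
    norm = math.sqrt(sum(value * value for value in raw))
    # An empty text yields an all-zero vector, which cannot be normalized.
    return [value / norm for value in raw] if norm > 0 else raw


def category_mean(vectors):
    """Component-wise average of the app vectors in one category."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]
```

For example, two apps in one category with token lists `["photo", "edit", "filter"]` and `["photo", "share"]` share only the term "photo", so that component has the largest weight in the category's standardization vector.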

The distance was then measured by the Manhattan distance between the app’s vector and the standardization vector of the category it belongs to. Let \( P_{i} (x_{1} ,x_{2} ,x_{3} , \ldots ,x_{n} ) \) be the vector of app i and \( S_{i} (y_{1} ,y_{2} ,y_{3} , \ldots ,y_{n} ) \) be the standardization vector of its category. The formula is as follows. Generally, the larger the distance, the more divergent the development strategy of the app.

$$ D_{i} = \sum\nolimits_{k = 1}^{n} {\left| {x_{k} - y_{k} } \right|} $$
(3)
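Eq. (3) translates directly into code. The helper below is a minimal sketch with a hypothetical name; it takes an app vector and the standardization vector of its category and returns their Manhattan distance.

```python
def manhattan_distance(app_vector, standard_vector):
    """Sum of absolute component-wise differences, as in Eq. (3)."""
    if len(app_vector) != len(standard_vector):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(x - y) for x, y in zip(app_vector, standard_vector))


# Toy example: the differences are 0.1, 0.3 and 0.5, so the distance
# is approximately 0.9 (up to floating-point rounding).
distance = manhattan_distance([0.6, 0.8, 0.0], [0.5, 0.5, 0.5])
```

An app whose vector equals the category average has distance 0; the more its word usage departs from that average, the larger the distance and the more divergent it is taken to be.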

5 Estimation Procedure

In this section, we examine whether convergent or divergent development strategies improve the performance of apps in different market status.

5.1 Measurement

Each of the i = 1, …, I apps possesses a certain market status. As mentioned in Sect. 4.2, we identify the position of an app in the BCG matrix by the criteria of 10% growth rate and 20% relative market share. We use BCG matrix position dummies to reflect apps’ market status. Apps with more than 10% growth rate and more than 20% relative market share are Stars, which serve as the baseline category and are denoted as \( PositionCashCow_{i} = 0 \), \( PositionProblemChildren_{i} = 0 \) and \( PositionDogs_{i} = 0 \). Similarly, we identify Cash Cow, Problem Children and Dogs apps and denote them as \( PositionCashCow_{i} = 1 \), \( PositionProblemChildren_{i} = 1 \) and \( PositionDogs_{i} = 1 \), respectively.

Using the analytical approach proposed in Sect. 4.3, we use the normalized Manhattan distance to measure the degree of divergence of an app, denoted as \( Manhattan_{i} \). The larger \( Manhattan_{i} \) is, the more divergent \( app_{i} \) is, whereas a small \( Manhattan_{i} \) means that \( app_{i} \) largely reproduces functions that already exist in the market.

Our data come from multiple mobile platforms: (1) Mumayi, (2) Baidu mobile assistant, (3) 360 mobile assistant, (4) Eoemarket, and (5) App China. To control for the fact that mobile platforms differ from each other in their subscribers (number, preferences and so on), we include platform dummies, denoted as Platform.

Apps become more sophisticated as they increase in size, which means consumers need more time to download and try them. Therefore, we use the file size of an app in megabytes, denoted as Size, to control for the effect of file size on the performance of an app.

Zhou et al. [2] argued that apps that update at a faster rate are more likely to achieve better performance. Updates fix issues and provide more features, which means that quick updates lead to quick quality improvement. Hence, users are more confident in adopting the updated apps and spend more time on them. We measure the number of times an app was updated from Jan 1st to Aug 1st as UpdateCount to capture this effect.

Consumers tend to choose products about which they can get more information. That is to say, the length of an app’s description (denoted as DesLen) may affect its performance. Moreover, the performance of an app at a point in time, denoted as \( Pref_{t} \), follows from its previous performance, denoted as \( Pref_{t - 1} \), so we control for the latter to remove this correlation.

5.2 Model

To examine our hypothesis of the effect of development strategy, we estimated the following empirical model:

$$ \begin{aligned} Pref_{it} = & \beta_{0} + \beta_{1} *PositionCashCow_{i} + \beta_{2} *PositionProblemChildren_{i} \\ & + \beta_{3} *PositionDogs_{i} + \beta_{4} *Manhattan_{i} + \beta_{5} *PositionCashCow_{i} \\ & *Manhattan_{i} + \beta_{6} *PositionProblemChildren_{i} \\ & *Manhattan_{i} + \beta_{7} *PositionDogs_{i} *Manhattan_{i} + \beta_{8} *Size_{i} \\ & + \beta_{9} *Pref_{it - 1} + \beta_{10} *UpdateCount_{i} + \beta_{11} *DesLen_{i} + Platform_{i} + \varepsilon_{i} \\ \end{aligned} $$
(4)

where \( i \) indexes the apps. The dependent variable, \( Pref_{it} \), is the number of downloads of app i on Oct 1st, and \( Pref_{it - 1} \) is the number of downloads of app i on Aug 1st.
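To make the structure of Eq. (4) concrete, the sketch below builds the regressors for a single app, including the three position dummies (Stars are the baseline) and their interactions with the Manhattan distance. The function name, argument names and quadrant labels are hypothetical, and platform dummies are left out for brevity; this is an illustration of the specification, not the authors' estimation code.

```python
QUADRANTS = ("Star", "Cash Cow", "Problem Child", "Dog")


def design_row(quadrant, manhattan, size_mb, prev_downloads, update_count, des_len):
    """Regressors of Eq. (4) for one app, excluding platform dummies."""
    if quadrant not in QUADRANTS:
        raise ValueError(f"unknown quadrant: {quadrant}")

    # Position dummies: all three are 0 for the baseline category (Stars).
    cash_cow = 1.0 if quadrant == "Cash Cow" else 0.0
    problem_child = 1.0 if quadrant == "Problem Child" else 0.0
    dog = 1.0 if quadrant == "Dog" else 0.0

    return {
        "Intercept": 1.0,
        "PositionCashCow": cash_cow,
        "PositionProblemChildren": problem_child,
        "PositionDogs": dog,
        "Manhattan": manhattan,
        # Interaction terms let the slope on Manhattan differ by quadrant.
        "PositionCashCow*Manhattan": cash_cow * manhattan,
        "PositionProblemChildren*Manhattan": problem_child * manhattan,
        "PositionDogs*Manhattan": dog * manhattan,
        "Size": size_mb,
        "Pref_t-1": prev_downloads,
        "UpdateCount": update_count,
        "DesLen": des_len,
    }
```

Stacking one such row per app gives the design matrix, which could then be passed to any ordinary least squares routine together with the October 1st downloads as the response.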

The model above describes the relationship between performance and development strategy for apps in different market status. The coefficient of \( Manhattan_{i} \) captures the main effect of development strategy on app performance. The interaction terms \( {\text{PositionCashCow}}_{i} *Manhattan_{i} \), \( {\text{PositionProblemChildren}}_{i} *Manhattan_{i} \) and \( {\text{PositionDogs}}_{i} *Manhattan_{i} \) reflect the moderating role of market status in the effectiveness of development strategy.
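One way to read the interaction terms is through the marginal effect of divergence implied by Eq. (4). Differentiating the right-hand side with respect to \( Manhattan_{i} \) gives the following, which is a direct consequence of the specification and involves no additional assumptions:

$$ \frac{{\partial Pref_{it} }}{{\partial Manhattan_{i} }} = \beta_{4} + \beta_{5} *PositionCashCow_{i} + \beta_{6} *PositionProblemChildren_{i} + \beta_{7} *PositionDogs_{i} $$

Because exactly one quadrant applies to each app, the effect of divergence is \( \beta_{4} \) for Stars, \( \beta_{4} + \beta_{5} \) for Cash Cows, \( \beta_{4} + \beta_{6} \) for Problem Children and \( \beta_{4} + \beta_{7} \) for Dogs. Each interaction coefficient therefore measures how much the effect in that quadrant differs from the effect for Stars.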

5.3 Result

Table 1 reports the parameter estimates of the model above. We discuss the results for each of the four market status in turn.

Table 1. Estimation result

For apps in Stars, the development strategy coefficient (\( \beta_{4} \)) is significant, indicating that a divergent development strategy has a negative effect on app performance. Conversely, Star apps that incorporate more of the functions already existing in the market are more likely to obtain better performance. Therefore, H1 is supported.

With regard to the apps in Cash Cow, Fig. 1 reveals the interaction between market status and development strategy. As Table 1 indicates, the parameter estimate of \( {\text{PositionCashCow}}_{i} *Manhattan_{i} \) suggests that the effect of a divergent development strategy for apps in Cash Cow is 84,320,000 higher than for apps in Star. H2 is supported.

Fig. 1. Downloads of Cash Cows, Problem Children, and Dogs apps under low and high Manhattan distance.

Similarly, the parameter estimates of \( {\text{PositionProblemChildren}}_{i} *Manhattan_{i} \) and \( {\text{PositionDogs}}_{i} *Manhattan_{i} \) in Table 1 imply that the effect of a divergent development strategy for apps in Problem Children and Dogs is 72,240,000 and 73,790,000 higher, respectively, than for apps in Star. H3 is supported.

6 Discussion

6.1 Contributions and Managerial Implications

This paper makes four contributions to the existing theory. First, this research supplements prior studies of the determinants of mobile apps’ success. Lee and Raghu [3] proposed that the characteristics of the category an app belongs to are key to its longevity in top charts. Zhou et al. [2] pointed out that update speed is positively associated with the performance of apps based on the ecosystem of the software platform. However, there is no related research on development strategies. Our study extends the framework for achieving better app performance by adding the relationship between development strategies and app performance. Moreover, the division into divergence and convergence using text mining techniques provides a method to measure development strategies.

Second, we introduced the interaction with market status. On the supposition that a given development strategy has varying effects on apps in different circumstances, we treated market status as a typical kind of circumstance and used the BCG matrix to identify it. This idea of refining the effect of development strategies could be used to improve similar research, such as studies of the influence of update speed on mobile apps’ performance.

Third, we found a way to measure the divergence of a development strategy quantitatively using text vectorization techniques, making it possible to take advantage of the abundant text information on mobile platforms. Such techniques can be further improved in the future to explore all kinds of relationships in the mobile app market.

Fourth, we used market structure to explain the relationship between development strategies and performance, which carries traditional industry theory over to the Internet field. Our study thus makes a small contribution to building up the theoretical framework of this new industry.

Prior studies of the software market ignored the fact that apps operate under different conditions, and a general conclusion usually does not suit a particular app. This study refines the results by market status, so that developers in the app market can more easily make decisions according to their own conditions.

6.2 Limitations and Further Directions

The study is subject to several limitations and could also be extended in several ways. First, our data came only from the Android market. Future research may examine whether the same conclusion holds in the Apple market, since the users of the two markets differ. However, data availability restrictions prevent us from using the number of downloads to measure the performance of mobile apps in the Apple market, so different models may be needed there.

Second, endogeneity is another issue. Although we adopted description length, file size, update frequency, platform and previous performance to control idiosyncratic and time-constant unobserved characteristics associated with each app, the control variables are far from complete. For example, we neglected the divergence of categories and the maturity of apps. We will consider more comprehensive covariates in further study.

Third, we assumed a simple linear connection between development strategies and the performance of mobile applications. Further research may use a more complex model to describe the complexity of the mobile app market.

7 Conclusion

In this paper, we empirically evaluate which development strategies are most suitable for mobile applications in different market status. To do so, we estimate a multivariable linear regression of app performance on development strategies and market status. Specifically, we found that Star apps require convergent development strategies to attract potential consumers. More generally, we demonstrate that divergent development strategies benefit apps in the other quadrants of the BCG matrix. The conclusions of this study offer guidance for mobile application developers.