
1 Introduction

In the second quarter of 2017, about 340 million smartphones were sold according to IDC [1], and almost 2 billion smartphones will be sold worldwide throughout this year alone, with numbers still rising [2]. The number of mobile apps downloaded in this quarter is even more astonishing: an estimated 25 billion downloads across all platforms in the second quarter of 2017 [3]. However, although the market that app vendors can target is huge, vendors face a high risk of their apps not being downloaded or even found at all. On average, individuals in the US use only 10–11 apps per day, and over a month this number rises only to about 35. Yet, if a vendor can make it into this exclusive selection, the benefits are tremendous, as interaction with those few apps reaches about 140 min per person per day in the US. It follows that intense competition exists among app vendors, and it is particularly difficult for unknown vendors to enter the market due to users’ uncertainty perceptions.

In the context of online transactions, previous research has shown that trust in the vendor and its products is essential, as it can reduce the perceived risk of such a transaction (e.g., [4,5,6,7]). In the context of mobile apps, we focus specifically on the landing page of an app in the app store: it is an important touchpoint where relevant information can be provided to potential users to guide their purchase decision, and its elements may build trust in the app vendor. Based on these assumptions, we propose the following research question:

  • Does e-trust, formed based on elements of an app landing page, influence the intention of an individual to conduct a transaction (i.e., download or buy the app)?

In order to test the influence of trust in the app vendor based on information provided on the app landing page we conducted online experiments with a sample of potential users. In the next section, we outline our theoretical foundation including hypotheses derived from a review of previous research. Then, in Sect. 3, we provide details on our methods including the design of our stimulus material and the procedure of our online experiments. The results of our experiments are then described in Sect. 4, followed by a discussion and concluding remarks in Sect. 5.

2 Theoretical Foundation

Human interaction always entails a degree of uncertainty, as we are not able to predict with high certainty how our interaction partners will behave or what they intend to do. Even worse, it is almost impossible for us to do so, as individuals often act based on bounded rationality and may show behaviors that significantly deviate from how they may have behaved in comparable interactions before, without any explicit reasoning. Hence, it is often necessary to act based on assumptions, with one of the most profound assumptions being the trustworthiness of an interaction partner, which includes the expectation that the other party will act according to socially accepted standards.

Due to the particular importance of this scenario in the context of commercial transactions, trust has been a construct of research interest for many decades [8], and in the last two decades it has also become a primary construct in information systems (IS) research (e.g., [9,10,11,12,13,14,15,16,17,18,19]). Reichheld and Schefter [20] highlighted that trust has particular significance in online environments as (i) there is usually no contact between buyer and seller, (ii) there is no physical point of sale where individuals could check the quality of the product, and (iii) decision-making is mostly based on descriptive information (e.g., textual descriptions and images of products), which is typically only provided by the selling party. Hence, there is much potential for consumers to become victims of fraudulent behaviors such as misuse of provided data (e.g., credit card information), unfair pricing strategies (e.g., higher prices when prices are not transparent across transactions), or incorrect information (e.g., unreliable visual depictions of the current state of a product) [4].

Despite the level of attention that trust has received in research thus far, no agreement on a common definition of trust has been reached. Gefen [14] defined trust as “…the confidence a person has in his or her favorable expectations of what other people will do, based, in many cases, on previous interactions” (p. 726), but this definition is not completely applicable to our context, which is mostly characterized by one-time interactions. Gefen et al. [4] therefore later proposed that trust is “…one’s belief that the other party will behave in a dependable, ethical, and socially appropriate manner”, which more clearly reflects our understanding of trust as a form of personal attitude towards another individual or group. Directly related to online transactions, Pavlou [6] provided a similar definition, stating that trust in electronic transactions is “…the subjective probability with which consumers believe that an online transaction with a Web retailer will occur in a manner consistent with their expectations” (p. 817). We posit that this subjective likelihood also drives the intention of individuals to complete an online transaction to download an app that is of interest to them [21, 22]. We therefore follow the notion of Reichheld and Schefter [20], who stated that “Price does not rule the Web; trust does.” (p. 106).

In order to frame our investigation, we base our theoretical model on the conceptualization of trust in electronic transactions (e-Trust) provided by Gefen and Straub [15], which will be outlined in the next section.

2.1 E-Trust and Transaction Intention

Based on the empirical research by McKnight et al. [23], Gefen and Straub [15] proposed a theoretical framework to study trust in electronic environments. In this framework, trust is a multidimensional construct that comprises four components: integrity, benevolence, ability, and predictability.

Integrity refers to the perception that one’s interaction partner will behave in accordance with what was promised beforehand. This essentially boils down to whether the vendor is honest about its intention to honor an agreed deal (e.g., whether product specifications are a true indication of what the product will actually be like [7]).

Benevolence refers to the individual perception that one’s interaction partner actually cares about the well-being of its transaction partners. In the app vendor context, high perceived benevolence results in the belief that the vendor is not only interested in one-time sales, but in a long-lasting relationship with its customers.

Ability refers to the individual perception that one’s interaction partner is capable of fulfilling an agreed deal. This component can be particularly important for innovative or complex products for which few vendors exist or previous vendors might have already failed. In our context, an app that is seemingly comparable to many others sets a low threshold for its vendor to be perceived as able to deliver the expected functionalities or services.

Lastly, predictability refers to the individual perception that one’s interaction partner will reliably deliver the promised product or service, particularly in the case of repeated transactions. In the case of mobile apps this entails, for example, that potential users perceive the app vendor as able to deliver an app that will work over a long period of time (e.g., an app that keeps working despite frequent mobile operating system updates).

We expect that each of these components of e-Trust will be positively related to the transaction intention of a potential mobile app user. In our context, we use transaction intention to refer to the behavioral intention of an individual to download or even purchase a specific mobile app. We therefore posit H1 with sub-hypotheses for each of these components as follows:

H1a–d: Perceived integrity, benevolence, ability, and predictability of an app vendor, based on the elements of an app landing page, are positively related to individual transaction intention.

We can also expect that, based on the context of a specific app, the degree of explained variance for each of these components will vary. For example, integrity may be more important in the context of apps that require a high level of data privacy (e.g., apps that are used to track medical issues); benevolence may be more important in the context of long lasting relationships (e.g., for apps that are intended to be used frequently and over a long period of time); ability may be more important in the context of novel or complex apps that make new promises with which potential users do not have experience (e.g., apps that use smartphone sensors to measure heart rate); and predictability may be more important in the context of apps that are costly.

As will be laid out in detail below, we focus on sport tracking apps and therefore surmise that integrity, in particular, will be of higher importance than the other e-Trust components in our specific context, while ability and predictability should be of lower importance, since such apps are cost-free and offered with comparable functionality by many vendors.

2.2 Individual Characteristics

An individual characteristic that is an antecedent of perceived e-Trust and is also used in the theoretical framework of Gefen and Straub [15] is an individual’s disposition to trust other parties. This construct was previously introduced into the study of trust in electronic environments by Gefen [14], who stated that “[d]isposition to trust is a general, i.e. not situation specific, inclination to display faith in humanity and to adopt a trusting stance toward others (…).” (p. 726). In the study by Gefen and Straub [15], trusting disposition was the strongest predictor for most components of e-Trust (with the exception of benevolence), though other studies did not find such a relationship (e.g., [18]). We are therefore interested in investigating this connection and posit that individuals high in trusting disposition will be more likely to show high levels of e-Trust. More specifically, we formulate the following hypotheses:

H2a–d: Individual trusting disposition is positively related to perceived integrity, benevolence, ability, and predictability of an app vendor.

As trusting disposition is formed through life experiences and is therefore age-dependent [14], we further include age as an individual characteristic that may influence e-Trust either directly or indirectly through its effect on individual trusting disposition (e.g., [24]). As individuals tend to become less inclined to form initial trust as their life experience grows with age, we formulate the following hypotheses:

H3: Trusting disposition is lower in older individuals than in younger individuals.

H4a–d: Older individuals report lower levels of perceived integrity, benevolence, ability, and predictability of an app vendor than younger individuals.

Finally, we also expect gender differences regarding trusting disposition and perceived e-Trust. For example, Riedl et al. [11] found that the brain activations (fMRI scans) for men and women differed when they had to evaluate trustworthy and untrustworthy Internet offers, with women showing activation in more brain areas overall and in limbic areas. This could be linked to a greater risk perception in women and we therefore posit that women will be less likely to show high levels of e-Trust or trusting disposition, according to the following hypotheses:

H5: Trusting disposition is lower in women than in men.

H6a–d: Women report lower levels of perceived integrity, benevolence, ability, and predictability of an app vendor than men.

The resulting theoretical model that is the basis for our empirical investigation including the proposed hypotheses is shown in Fig. 1.

Fig. 1. Theoretical model and hypotheses

3 Methods

In order to test our hypotheses, we conducted an online experiment which aimed for high levels of external validity. To this end, we created a variety of stimuli based on actual mobile apps that are currently available in Google’s Play Store (for Android devices) and Apple’s App Store (for iOS devices). These stimuli were representative of the information that an app vendor as well as previous users can provide about a mobile app within the context of an app landing page in one of these app stores.

The development of these stimuli, the self-report instruments used to collect data on the respective constructs in our theoretical model and the procedure of our online experiment are laid out in the following sections.

3.1 Stimuli Development

For our stimuli we exclusively used real-world examples drawn from sport tracking apps that are currently available in several mobile app stores. We chose sport tracking apps specifically because there are currently many apps on the market that provide comparable functionality, many of them offered by largely unknown vendors. We drew study material from the app landing pages of Runtastic Results, an app that is most popular in Europe, Runkeeper, an app that is most popular in North America, and Running for weight loss as well as Freeletics bodyweight, apps that have a more international audience. To create variations of our stimulus material, we targeted those aspects of an app landing page that could provide pivotal information to potential users. Specifically, we were interested in information provided by the app vendor (i.e., the description and pictures of the app) as well as by other users (i.e., ratings and reviews).

App Descriptions.

The description of an app is among the most informative elements for a potential user to get a sense of an app’s capabilities [25]. According to Gefen et al. [7], such descriptions are also important for building trust in vendors. Two dimensions that are critical in this context are the length of the description text and how neatly it is structured, with a short and concise synopsis of the app’s uses and functionalities being a desirable characteristic [25]. To assess these dimensions, we used the coding scheme provided by Reiners [26], which includes guidelines on the composition of a sentence (e.g., how many verbs and words in total). Though this scheme was developed for the German language, we abstracted it in a way that made it applicable to English descriptions as well, i.e., we focused on the overall length of a description and the average number of words per sentence to distinguish long and unstructured descriptions from short and structured ones. This led, for example, to an app description with a total of 447 words and an average of 10.4 words per sentence being selected as the short and structured variant for English-speaking iOS users, and a description with a total of 568 words and an average of 12.9 words per sentence being selected as the long and unstructured variant for the same user group.
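The two abstracted dimensions can be computed mechanically. The sketch below is our own illustration of that computation; the function name and the sentence-splitting heuristic are assumptions and are not part of the coding scheme of Reiners [26]:

```python
import re

def description_metrics(text: str) -> tuple[int, float]:
    """Return (total word count, average words per sentence) for a description.

    Sentences are split on runs of '.', '!' or '?' -- a rough heuristic that
    stands in for the more detailed rules of the original coding scheme.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    return len(words), len(words) / max(len(sentences), 1)

# Toy two-sentence description: 9 words, 4.5 words per sentence on average.
total, avg = description_metrics("Track your runs with GPS. Share results with friends.")
```

Applied to the real candidates, a description with 447 words and 10.4 words per sentence would be classified as short and structured relative to one with 568 words and 12.9 words per sentence.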

Visual Depictions.

We chose three different types of pictures that are frequently used on app landing pages in this category. The first variation is a simple screenshot taken within the respective app that captures major parts of its functionality (Fig. 2); the second variation combines an in-app screenshot with an advertising message that is less informative to potential users (Fig. 3); and the third variation is an advertising image that does not include any actual footage from the mobile app but is only used to promote its potential benefits (Fig. 4).

Fig. 2. Product screenshot

Fig. 3. Screenshot with text

Fig. 4. Advertising image

Ratings and Reviews.

Previous research has found that second-hand information in the form of user ratings and reviews of mobile apps can also influence their success [27]. Of these two elements, ratings (mostly scales of one to five stars) are usually assessed first, before the longer user reviews are examined [28]. Among a large number of potential mobile apps offering comparable functionality, those that received distinctly negative ratings are usually filtered out first [5, 29]. In the case of overall comparable ratings (e.g., several apps with high ratings), reviews are then used as a source of second-hand information by potential users [24], in particular to understand the benefits and functionalities of an app and perhaps to gather more information on the reasons for negative ratings [30].

To manipulate these elements of an app landing page, we took two actual reviews with ratings from the Runtastic Results app and changed the user name as well as the date of each review. The three variations we created were (i) a five-star rating together with an overall negative review, (ii) a one-star rating together with an overall positive review, and (iii) a five-star rating together with a positive review of the app; the last was intended to be the most useful variant as rating and review supported each other.

For each of our eight stimulus variations (see Table 1) we created versions based on either the Android or the iOS version of the landing pages of our selected apps. In addition, we created versions of each stimulus in English as well as in German. Note, however, that we removed the brand of each app and replaced it with the fictitious “FitnessApp”. This was done because a brand can have significant influence on individual trust perceptions, particularly if individuals have had previous experiences with it [18]. As we were interested in initial trust formation in the absence of such experiences, we removed brand names and logos. An overview of the resulting 32 stimulus variations (Variations × Mobile Ecosystem × Language) can be found in Table 1.

Table 1. Overview of stimuli variations

3.2 Measurement Scales

The measurement scales that we used in our study to gather data on trusting disposition, e-Trust, and transaction intention were taken from Gefen and Straub [15] (see Table 2). We used a seven-point Likert scale ranging from (1) strongly disagree to (7) strongly agree.

Table 2. Overview of measurement scales and items used

Our demographic variables gender and age were both measured dichotomously. For age we used the two classes of “younger” and “older” individuals, where younger meant below 30 years and older meant 30 years or older. The reason for this specific split is that individuals younger than 30 years have mostly grown up with mobile phones or even smartphones (the iPhone was introduced in 2007) and are therefore likely more familiar with app transactions, while individuals aged 30 years or older may have more transaction experience overall (traditional and electronic commerce in general), but not in this particular area.

3.3 Procedure

We conducted our online experiments from the end of May 2017 until the beginning of July 2017. We provided the link to the online survey tool (QuestionPro) on Clickworker and Amazon’s Mechanical Turk. When starting the experiment, participants first indicated their language and the mobile operating system they mainly used, and were accordingly presented only with the versions of our stimuli developed for their language and operating system. Our online experiment then followed a between-subjects design for each of the three components of the app landing page that we manipulated.

More specifically, participants saw one of the variations of the app description, one of our three types of visual depictions, and one combination of rating and review. The order in which these components appeared was randomized, as was the selection of the individual variant presented to a participant. After each of these stimuli, participants rated their perceived e-Trust, and after all three evaluations they reported their individual trusting disposition, transaction intention, and demographic variables.

4 Results

In this section, we report the results of our empirical investigation. We first focus on the characteristics of our sample and data screening procedures, then we report the steps that we have taken to ensure the reliability and validity of our measures, and finally we present the results of the statistical analyses we applied to test the hypotheses.

4.1 Sample Characteristics

In total, 2,158 individuals completed our online experiment. We removed responses with low engagement across our self-report measures (i.e., a standard deviation of 0, indicating that a participant gave the same answer to every item). This led to 116 responses being removed, resulting in a sample of 2,042 responses for our further investigation (there were no significant outliers that had to be removed in addition). About half of our respondents are women (51%), and 61% are 30 years or older.
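The zero-variance screening rule can be sketched in a few lines; the function name and toy responses below are our own illustration, not the study's actual screening script:

```python
import statistics

def screen_responses(responses: list[list[int]]) -> list[list[int]]:
    """Drop responses whose Likert answers never vary: a standard deviation
    of 0 across all items indicates straight-lining, i.e. low engagement."""
    return [r for r in responses if statistics.pstdev(r) > 0]

raw = [
    [4, 4, 4, 4],  # straight-lining -> removed
    [5, 3, 6, 2],  # varying answers -> kept
    [1, 7, 4, 4],  # varying answers -> kept
]
clean = screen_responses(raw)
```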

4.2 Reliability and Validity Analyses

In accordance with the recommendations by Homburg and Giering [31], we first conducted an exploratory factor analysis (EFA) to test the factor structure of our constructs using SPSS version 24. We took this step particularly to test whether e-Trust in the specific context of our investigation also comprises four distinct factors. As extraction method we used principal component analysis with promax rotation (we used an oblique rotation method as our constructs are likely correlated).

Bartlett’s test of sphericity was significant, indicating the appropriateness of an EFA for our data set, and the KMO measure showed high potential for dimension reduction (.943). Including all self-report items, the initial EFA resulted in four factors explaining a total of 74.75% of the variance in our data. In order to reduce cross-loadings and to improve overall fit, we dropped items INTE1 and INTE2, which had the lowest loadings overall (INTE1: .580; INTE2: .544). This led to an improved overall explained variance of 75.83% and the factor structure depicted in Table 3.
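The extraction step of such an analysis, determining how much variance each principal component explains, can be illustrated with a numpy sketch. This covers only the eigenvalue-based variance decomposition, not the promax rotation or the SPSS implementation used in the study, and the toy data are ours:

```python
import numpy as np

def explained_variance_ratios(data: np.ndarray) -> np.ndarray:
    """Fraction of total variance captured by each principal component:
    the eigenvalues of the item correlation matrix, sorted descending,
    divided by their sum (the extraction step of a PCA-based EFA)."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
    return eigvals / eigvals.sum()

# Toy items built from two latent variables -> two dominant components.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
items = np.column_stack([
    latent[:, 0], latent[:, 0] + 0.1 * rng.normal(size=200),
    latent[:, 1], latent[:, 1] + 0.1 * rng.normal(size=200),
])
ratios = explained_variance_ratios(items)
```

In a real EFA the number of retained factors would then be chosen (e.g., by eigenvalues above one) before rotating the solution.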

Table 3. Factor loadings, average variance extracted and composite reliability

Interestingly, we did not find a suitable solution in which all four distinct factors of e-Trust could be kept. Instead, a solution with only two distinct factors formed, with one factor comprising the indicators used to measure benevolence and integrity and the other comprising the indicators used to measure ability and predictability. We call the first of these factors “trust in vendor intentions” as it comprises items that mainly measure an app vendor’s potential faithfulness in keeping promises, and the second factor “trust in vendor capabilities” as it comprises items that mainly measure an app vendor’s capability to actually fulfill a transaction, independent of its intentions.

Notably, this finding is in accordance with more recent studies investigating the factor structure of e-Trust, such as that by Barki et al. [32]. They found that the two main motivational determinants of trust are the “can do” and the “will do” of an interaction partner, which corresponds well with our findings (i.e., “trust in vendor intentions” reflects the “will do” component while “trust in vendor capabilities” reflects the “can do” component).

In order to conduct further tests on our measurement and structural models, we used SmartPLS version 3. In accordance with the recommendations by Henseler et al. [33], we first calculated the SRMR as an approximate measure of model fit. Our calculation resulted in an SRMR of .048, which is below even a conservative threshold of .05. For internal consistency, we assessed the composite reliability and Cronbach’s α of our constructs. According to Bagozzi and Yi [34], composite reliability should exceed 0.60, and according to Nunnally and Bernstein [35], Cronbach’s α should exceed 0.70 in order to ensure internal consistency. As shown in Table 3, these criteria are met for all constructs.
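Both internal-consistency statistics follow standard formulas; the sketch below states them directly (the example numbers are hypothetical and are not our study values):

```python
def cronbach_alpha(item_variances: list[float], scale_variance: float) -> float:
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances /
    variance of the summed scale), for k items."""
    k = len(item_variances)
    return (k / (k - 1)) * (1 - sum(item_variances) / scale_variance)

def composite_reliability(loadings: list[float]) -> float:
    """Composite reliability from standardized indicator loadings:
    (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each error variance is 1 - loading^2."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

alpha = cronbach_alpha([1.0, 1.0, 1.0], scale_variance=6.0)
cr = composite_reliability([0.80, 0.85, 0.90])
```

Both hypothetical values clear the thresholds cited above (0.70 for α, 0.60 for composite reliability).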

We then evaluated the convergent and discriminant validity of our constructs. For convergent validity, we followed the guideline by Fornell and Larcker [36] that all item loadings should exceed 0.70 and the guideline by Kline [37] that the average variance extracted (AVE) of each construct should be at least 0.50. As also shown in Table 3, our constructs and items fulfill these criteria. Finally, in order to confirm the discriminant validity of our measures, we tested for fulfillment of the Fornell-Larcker criterion [36], which involves comparing the square root of each construct’s AVE with its correlations with all other constructs. Further, as recommended by Henseler et al. [33], we also calculated the HTMT ratio for all constructs as a measure of discriminant validity. As none of the construct inter-correlations exceeds the square root of the respective AVE, and none of the HTMT values (shown in parentheses in Table 4) comes close to one, we can assume sufficient discriminant validity of our measures (see Table 4).
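The AVE computation and the Fornell-Larcker comparison can be sketched as follows; the loadings and the inter-correlation are hypothetical numbers for illustration, not values from our data:

```python
import math

def ave(loadings: list[float]) -> float:
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def fornell_larcker_ok(ave_a: float, ave_b: float, corr_ab: float) -> bool:
    """The criterion holds when each construct's sqrt(AVE) exceeds the
    correlation between the two constructs."""
    return abs(corr_ab) < min(math.sqrt(ave_a), math.sqrt(ave_b))

# Hypothetical loadings for two constructs and their inter-correlation.
ave_a = ave([0.80, 0.85, 0.90])
ave_b = ave([0.75, 0.80, 0.85])
ok = fornell_larcker_ok(ave_a, ave_b, corr_ab=0.55)
```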

Table 4. Discriminant validity test

4.3 Hypotheses Testing

After confirming the quality of our measurement model, we proceeded to test our structural model. As a prerequisite, we first checked for multicollinearity between our independent variables (i.e., INTE and CAPAB), which could bias the calculated path coefficients, and, as we collected all our data through self-reports, we also checked for signs of common method bias (CMB).

In order to test for multicollinearity, we ran linear regressions in SPSS in which each of our two independent variables and their dependent variable (transaction intention) took turns as the criterion, to see whether the collinearity tolerances and variance inflation factors (VIF) were acceptable (i.e., VIF values should be less than ten, while collinearity tolerances should be greater than 0.10 [38]). As VIF values ranged between 1.123 and 1.914 and collinearity tolerances between .522 and .891, we can assume that there are no signs of multicollinearity.
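The two diagnostics are simple transformations of the auxiliary R², the variance explained when one predictor is regressed on the remaining ones; the sketch below uses a hypothetical correlation, not our study values:

```python
def vif(r_squared: float) -> float:
    """Variance inflation factor for one predictor, given the R^2 obtained
    when regressing that predictor on all remaining predictors."""
    return 1.0 / (1.0 - r_squared)

def tolerance(r_squared: float) -> float:
    """Collinearity tolerance, the reciprocal of the VIF."""
    return 1.0 - r_squared

# With only two predictors, the auxiliary R^2 is their squared correlation.
r = 0.55  # hypothetical correlation between the two predictors
v = vif(r ** 2)
t = tolerance(r ** 2)
```

Both hypothetical values fall well inside the cited cut-offs (VIF below ten, tolerance above 0.10).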

To test for common method bias, we applied Harman’s single factor test [39], which involves an exploratory factor analysis in SPSS without rotation, forcing a single factor. This single factor explained only 44.62% of the total variance (more than 50% would be indicative of CMB), so we can assume that common method bias is no significant threat to our analyses.
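The logic of the test, checking whether one unrotated factor captures more than half of the total variance, can be approximated by the largest eigenvalue of the item correlation matrix. The toy data below are our own, not the study data:

```python
import numpy as np

def harman_share(data: np.ndarray) -> float:
    """Share of total variance captured by the first unrotated component,
    approximated by the largest eigenvalue of the correlation matrix
    divided by the number of items (the trace of the matrix)."""
    corr = np.corrcoef(data, rowvar=False)
    return np.linalg.eigvalsh(corr).max() / corr.shape[0]

# Toy items driven by two independent latent factors: no single factor
# dominates, so the share stays below the 50% warning threshold.
rng = np.random.default_rng(1)
f1 = rng.normal(size=(300, 1))
f2 = rng.normal(size=(300, 1))
noise = 0.5 * rng.normal(size=(300, 6))
items = np.hstack([f1 + noise[:, :3], f2 + noise[:, 3:]])
share = harman_share(items)
```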

To test the significance of our structural paths we used the bootstrapping method in SmartPLS 3 with 5,000 subsamples [33]. To calculate the path coefficients we then used the PLS algorithm also with 5,000 iterations. The results are depicted in Fig. 5. It has to be noted here that we reorganized our model due to the two-factor structure we found for e-Trust in our study context.

Fig. 5. Research model showing significance of relationships

Regarding H1, which posited that the components of e-Trust predict transaction intention, we found support in both cases. Yet, the effect sizes are small (f² for INTE: .009; f² for CAPAB: .058), and only trust in vendor capabilities reached even a weak effect on individual transaction intention [40].
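The reported f² values follow Cohen's effect-size formula for a single predictor; the R² values in the sketch below are hypothetical, chosen only so that the formula reproduces an f² of .058:

```python
def cohens_f2(r2_full: float, r2_reduced: float) -> float:
    """Cohen's f^2 for one predictor: the gain in R^2 from adding the
    predictor, scaled by the unexplained variance of the full model.
    Conventional thresholds: .02 weak, .15 moderate, .35 strong [40]."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)

# Hypothetical illustration: R^2 drops from .30 to .2594 when the
# predictor is removed, giving f^2 = .058.
f2 = cohens_f2(r2_full=0.30, r2_reduced=0.2594)
```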

In accordance with H2, we found an influence of trusting disposition on e-Trust for both trust in vendor intentions (a weak effect, f² = .079) and trust in vendor capabilities (a moderate effect, f² = .160).

Interestingly, we found a direct effect of neither age nor gender on trusting disposition and therefore had to reject H3 and H5. As we were still interested in their potential effects, we additionally investigated possible moderating effects of age and gender on the relationship between trusting disposition and e-Trust. The effect of age was not significant in either case, while gender had a significant negative effect in both cases. This indicates that trusting disposition predicted e-Trust to a lesser degree in men than in women, though the effect size is negligible in both cases (f² of .003 for the effect on trust in vendor intentions and f² of .004 for the effect on trust in vendor capabilities).

Finally, we also investigated direct effects of age and gender on e-Trust. We found that all relationships in this context were significant, but only the relationship between age and trust in vendor capabilities conformed to H4. In all other cases, the opposite effect was observed (i.e., older individuals reported higher levels of perceived trust in vendor intentions; women reported higher levels of e-Trust for both of its components). Hence, the data partially confirm H4, while we have to reject H6. For the most part the effect sizes were negligible (f² of .004 for the direct effect of age on trust in vendor intentions; f² of .005 for the direct effect of age on trust in vendor capabilities; f² of .007 for the direct effect of gender on trust in vendor intentions), with the exception of the direct effect of gender on trust in vendor capabilities, where we observed a small effect (f² of .020). We summarize our findings in Table 5.

Table 5. Overview of findings

5 Discussion and Concluding Remarks

The main goal of our research was to investigate the role of initial trust in the context of mobile apps. We therefore conducted online experiments with stimuli that aimed for high external validity and a sample of participants from two different countries. We found that initial trust in an app vendor, based on information provided on the landing page of an app, significantly influences the transaction intention of individuals (i.e., the likelihood that they will download or even purchase an app). A practical implication is that trust evaluations of landing pages for mobile apps should become a business routine, including A/B testing to find those elements of a specific landing page that can successfully build the trust of potential users.

An interesting finding of our research is that in this specific context, e-Trust does not comprise four distinct factors as in the theoretical framework on which we based our research [15], which was previously applied in the context of e-commerce transactions (i.e., Amazon transactions). The two distinct factors that emerged in our research focus on individual trust in the capabilities of an app vendor and on the vendor’s intentions to stay faithful to a given deal. We can only speculate about the reasons for this result, but it is likely due to the specific context of an app transaction. For example, while e-commerce transactions in general often still involve physical products being shipped, app transactions only involve software being transferred and therefore constitute a type of transaction with fewer steps in its fulfillment. Also, app transactions can often be deemed “micro-transactions”, as most apps are either free or cost only a small amount of money. For these reasons, and perhaps due to other characteristics of app transactions as well, integrity and benevolence did not emerge as separate factors: there is less risk involved than in usual e-commerce transactions (e.g., no shipment errors or delays, a lower chance of problems during payment), and benevolence in particular seems less distinctively important, as there are fewer potential points of interaction with an app vendor. For comparable reasons, predictability might not have emerged as a separate factor, as the transaction itself is virtually identical across vendors, mostly due to common exchange platforms (Play Store or App Store). Importantly, our findings are in line with recent evidence showing that the relationship between ability, benevolence, and integrity is non-linear [32].

Due to the specific type of app that we used for our empirical investigation (i.e., sport tracking apps) and the type of data collected (i.e., health-related information), we originally expected that integrity, which is part of our first factor, trust in vendor intentions, would be the construct that explains most of the variance in an individual’s transaction intention. However, this was not the case. Instead, trust in a vendor’s capabilities is a far stronger predictor of transaction intention. This result may be due to our experimental setting, which provided information that allowed potential users to judge how useful a specific app could be to them and which therefore emphasized the benefits and functionality of an app. The integrity of the vendor or information privacy measures were probably not of immediate importance and therefore received less consideration. In addition, it is possible that the so-called privacy paradox affected this type of judgment: previous research on information privacy has found that while people rate information privacy as highly important when asked directly, they also readily exchange private information for some immediate benefit such as personalization of services or monetary rewards [41, 42].

Another unexpected finding was that men reported lower levels of e-Trust than women, though the size of the effect was small (avg. INTE for men: 4.79 vs. 4.90 for women; avg. CAPAB for men: 4.73 vs. 4.97 for women). Although we expected that women would be more risk-averse and therefore less trusting towards our fictitious vendor based on the given app information, men were less trusting in our specific context. We can assume that this effect was caused by the specific research context, as we found no general effect of gender on trusting disposition. One potential reason for this difference is that the advertising images we used as part of our stimulus material depicted only women, which may have led female participants to report higher levels of trust as they were directly targeted by these images. Yet, only about one third of our participants saw this specific stimulus, so this explanation alone cannot account for our results. It would therefore be interesting to investigate this finding further in order to determine which individual or contextual characteristics might have caused the difference (e.g., by using different types of apps as stimuli and taking personality characteristics into account).

Like all research, ours has its limitations. We designed our study as an online experiment and distributed it indiscriminately amongst German-speaking and English-speaking audiences. As cultural differences could have influenced our results, not asking about respondents’ nationality or dimensions of their cultural background can be regarded as a limitation. In addition, our research focused on only a small number of constructs. Future research could, for example, investigate the influence of further characteristics of a specific app on e-Trust, such as its cost [27]. Furthermore, the actual formation and level of initial trust could be clarified if apps from unknown vendors were directly compared to apps from vendors known at least by their brand [18]. Still, despite these limitations, we can conclude that initial trust formation plays an important role in the context of mobile app transactions.

It follows that future research should investigate the factors that drive e-Trust and that can be controlled by app vendors (e.g., the influence of the different landing page elements that we have included as well as further elements), but also contextual factors such as the influence of different cultures, experience with different mobile operating systems or the type of app that is advertised on the landing page.