1 Introduction

According to the Internet 500 mobile study, mobile sales by the leading 500 retailers grew 70 % in 2013 and were expected to grow another 80 % in 2014. Global mobile data traffic grew 81 % in 2013 and 45 % in 2014 [1]. The ardent expectations for the coming mobile commerce age spurred by these statistics, however, must be weighed against a potentially limiting factor inherent in the mobile phone itself, i.e., its small display size.

A small screen is likely to reduce users’ task effectiveness and to increase the navigational activity required [2, 3]. This unfavorable feature may interact with the multitasking and resource-competition conditions faced by mobile phone shoppers and deter them from shopping on the go (cf. [4]). Does the small screen size of mobile phones affect other dimensions of shopping-related processing? The current study focused on one such possibility: the smaller spatial range of visual attentional processing imposed by the small screen may restrict the scope of conceptual processing [5]. It is hoped that examining this unexplored behavioral dimension of screen size, and its consequences for shoppers’ evaluation of products and advertisements, will contribute to current knowledge of online store design.

2 Literature Review

Earlier studies showed that the small display size of mobile phones adversely affects performance. Comparing PDAs and desktop computers, [6] found longer task completion times and lower task success rates for small than for large screens. Reduced screen size (1.65 in. vs. 2.65/3.78 in.) impaired the effectiveness of video-based mobile learning [7]. Reference [8] compared menu selection performance between a large and a small screen (i.e., 800 × 600 vs. 240 × 320 pixels). Task performance was more efficient (faster task completion times) and users’ recognition memory (awareness) of the menu items was somewhat better for the large than for the small screen.

Some recent studies found more limited effects of screen size on task performance. Reference [9] evaluated the effect of mobile phone screen size using three sizes (i.e., 3.5, 4.3 and 5.3 in.) and information retrieval tasks. Screen size affected neither perceived usability (i.e., SUS scores) nor effectiveness (i.e., task completion rate). A larger screen increased efficiency (i.e., shortened task completion time) only when the task required more interactions (i.e., was more difficult). In a similar vein, J. Kim [10] found that it was more difficult for users of small screens (phones) than of large screens (computers) to extract information from the search results page, though search performance was equivalent across screen sizes. Whether screen size affects the success or the efficiency dimension of task performance thus appears to depend on the absolute screen sizes and the nature and complexity of the task, among other factors.

Although the effect of screen size on task performance may be more subtle than expected, other dimensions of cognitive processing could be affected by size. Visual exploration of a small screen spans a smaller spatial extent than that of a large screen and entails a narrower perceptual/attentional scope. A narrow attentional scope prompts the individual to maintain a conceptually limited array of information, while a wider scope enhances the allocation of resources to distal, less relevant information [11, 12]. Reference [5] provided experimental evidence that the spatial extent of perceptual search affects one’s conceptual scope. They asked participants to search for ‘3’s in arrays of digits. When the digits were spread over a wider area, participants exhibited broader conceptual scopes, enabling them to subsequently generate more original uses of a brick and more original category exemplars. The current study thus hypothesized that the small interface of an App store on a phone engenders a narrower perceptual and conceptual scope than the large interface of a web store on a desktop computer (H1, see Table 1).

Table 1. Hypotheses examined in the study

Furthermore, because the user with a narrower processing scope devotes processing resources to a limited array of information [11, 13], he or she is expected to accomplish the task with less effort than when the processing scope is wide. Thus, fewer gazes and shorter gaze durations are expected for the App store than for the web store interface, given that the processed products and the user’s task demands are equivalent for the two types of stores (H2, see Table 1).

When the processing scope is wide rather than narrow, the user is more likely to attend to information presented peripheral to the products, such as advertisements. As a result, a greater number of gazes is expected to fall on the ads, and the total gaze duration on the ads is expected to be longer, when the processing scope is wide than when it is narrow (H3, see Table 1).

Previous research on ad placement has found that advertisements displaying products related to the central context of the website are generally more effective than irrelevant ads. Click-through rates, ad attitudes, ad memory, and purchase intentions were higher when the advertised products were related to the displayed products [14–16]. Reference [14], for example, compared evaluative responses toward a banner ad highly relevant to the hosting website (e.g., a student loan banner ad on an online student financial loan service website) with those toward a low-relevance banner ad (e.g., a computer brand ad), controlling for the familiarity of the website and brand names as well as the location of the ad. An ad relevance effect was found: high-relevance ads were liked more and elicited higher purchase intentions than low-relevance ads, while memory of the ad was not affected by relevance. Relevant ads are perceived as related to the user’s current goal in interacting with the website and are thus evaluated more positively [17]. We expected similar ad relevance effects for the banner ads in this study. However, the small display size of the App store interface is expected to narrow the conceptual scope of the participant, making it more difficult for her to see the relevance or similarity between the advertised products and the website products. The ad relevance effect is thus expected to be smaller for the small/App than for the large/web interface (H4, H4a, H4b, see Table 1).

3 Method

3.1 Participants

Forty female undergraduates aged 19–24 participated in exchange for partial course credit. Half of them were assigned to the large/web condition and the other half to the small/App condition. All had at least three years of online shopping experience and shopped online about once a month on average.

3.2 Design and Materials

The study employed an interface size/type (small/App vs. large/web) × ad relevance (low vs. high) design, with size/type as a between-subjects factor and relevance as a within-subjects factor. The product array consisted of nine clothing items presented in a 3 × 3 grid, with the target product (a black jacket) appearing at a random location. The products were collected from the Internet, with prices controlled within US$12–30. Beneath the product array was a banner ad displaying either high-relevance products (i.e., clothing/accessories or shoes) or low-relevance products (i.e., electronics/computers or food).

Forty different product arrays and forty different banner ads (ten for each of the four product categories) were generated. The pairing of the specific target product with the banner ad category was counterbalanced so that, across participants in either the web or the App store condition, each product appeared with the four banner ad categories about equally often. Each participant viewed a total of forty App or web store images, which extended approximately 11° or 32° of visual angle in width, respectively. These two sizes correspond to the approximate screen visual angles of a 5-inch mobile phone viewed at a distance of 35 cm and a 17-inch monitor viewed at 60 cm. A small/App interface image consisted of the nine-product array and the banner ad, whereas the large/web interface image additionally contained navigation bars and website ads placed to the left and right of the product array (see Fig. 1).
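For concreteness, the following minimal Python sketch reproduces the reported visual angles from the standard 2·atan(w/2d) formula. The physical screen widths used here are assumptions (roughly 6.2 cm for a 5-inch 16:9 phone screen and roughly 34.5 cm for a 17-inch monitor); they are not reported in the paper.

```python
import math

def visual_angle_deg(width_cm: float, distance_cm: float) -> float:
    """Horizontal visual angle (degrees) subtended by a display of the given
    width viewed at the given distance: 2 * atan(width / (2 * distance))."""
    return math.degrees(2 * math.atan(width_cm / (2 * distance_cm)))

# Assumed physical widths (not reported in the paper):
PHONE_WIDTH_CM = 6.2     # ~5-inch 16:9 phone screen
MONITOR_WIDTH_CM = 34.5  # ~17-inch monitor

print(f"small/App: {visual_angle_deg(PHONE_WIDTH_CM, 35):.1f} deg")   # ~10-11 deg
print(f"large/web: {visual_angle_deg(MONITOR_WIDTH_CM, 60):.1f} deg") # ~32 deg
```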

Fig. 1. An example of the large/web store (top) and the small/app store (bottom) interface

Participant responses were evaluated using the following items. The number in brackets indicates whether the response was collected during phase 1 or phase 2 of the experiment (see Procedure). Nine-point scales were used for all items except the manipulation check for conceptual scope:

  • Manipulation check for conceptual scope (1): Items were selected from the Chinese Remote Association Test (CRAT) [18], a Chinese version of the Remote Association Test [19]. Each item comprised three Chinese characters, and the participant responded with one character that could form a two-character word with two or all three of the given characters. Ten items were selected from the CRAT such that the constituted words were of the highest frequencies and had similar character positions within the words, i.e., the easier items in the CRAT, to increase sensitivity to the manipulation.

  • Manipulation check for ad relevance (2): “The banner ad is relevant to the webpage content,” “The banner ad is related to the features and functions of the webpage.”

  • Product purchase intention: “I may purchase this product” (1), for the store products; “I may purchase the product in the banner ad” (2), for the advertised products.

  • Product attitude: “I like the product” (1), for the store products.

  • Ad click intention: “I’d like to click on the ad” (1).

  • Ad attitude: “I like the banner ad” (1), “The banner ad is pleasant” (2).

3.3 Procedure and Apparatus

The participant was seated 60 cm in front of a Tobii T60 (60 Hz) eye tracker in a quiet room while her eye position was tracked. She was asked to imagine that she was shopping online, looking for a black jacket. She was to click on the black jacket on each product page and answer questions concerning the products and ads. The participant’s eye position was calibrated using a five-point calibration procedure, which was repeated as needed following Tobii’s guidelines.

The trials were then presented to the participant in a random order. In each trial, the App or web store image was presented for as long as needed, until the participant clicked on the target product (i.e., the black jacket) to terminate the presentation of the image. The webpage product purchase intention, webpage product attitude, ad attitude, and ad click intention questions and rating scales were then presented consecutively on the screen, and the participant indicated her rating with a mouse click. After completing the forty trials, the participant responded to the manipulation check for conceptual scope.
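As an illustration of how the counterbalancing and random trial order described above could be implemented, the sketch below rotates the product-array-to-ad-category pairing across participants and then shuffles the trial order. The rotation scheme and all names are assumptions chosen for illustration; the paper does not specify the exact assignment procedure.

```python
import random

AD_CATEGORIES = ["clothing", "accessory", "phones_computers", "food"]
N_ARRAYS = 40  # forty different product arrays, ten ads available per category

def build_trial_list(participant_id: int, seed: int = 0) -> list[tuple[int, str]]:
    """Pair each product array with an ad category by rotating the pairing
    across participants (each participant sees 10 ads per category), then
    shuffle so the trials are presented in a random order."""
    trials = [(array_id, AD_CATEGORIES[(array_id + participant_id) % len(AD_CATEGORIES)])
              for array_id in range(N_ARRAYS)]
    random.Random(seed).shuffle(trials)
    return trials

# Example: first five trials for one hypothetical participant.
print(build_trial_list(participant_id=3)[:5])
```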

The forty App/web store images were then presented again for ratings of purchase intention for the advertised product, ad attitude, and the ad relevance manipulation check. Participants then completed an online shopping experience questionnaire.

4 Results and Discussion

One participant was excluded from the analysis because of partial loss of her data.

4.1 Manipulation Checks

Conceptual scope was measured by the number of associations generated on the CRAT. The mean number of associations was significantly higher for the large/web interface than for the small/App interface (M = 8.15 vs. 6.53), t(37) = −4.92, p < .05, suggesting a larger conceptual scope for the large/web than the small/App interface and supporting H1.
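For reference, a minimal sketch of this between-groups comparison using SciPy is shown below; the per-participant CRAT scores are simulated purely for illustration, with group means set near the reported values, and are not the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical per-participant CRAT association counts (0-10), simulated only
# to illustrate the test; group means are set near the reported 8.15 vs. 6.53.
crat_web = np.clip(rng.normal(loc=8.15, scale=1.0, size=20), 0, 10)
crat_app = np.clip(rng.normal(loc=6.53, scale=1.0, size=19), 0, 10)

# Independent-samples t-test comparing the two interface groups (df = 37).
t, p = stats.ttest_ind(crat_app, crat_web)
print(f"t({len(crat_web) + len(crat_app) - 2}) = {t:.2f}, p = {p:.4f}")
```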

The two ad relevance ratings were averaged and submitted to a t-test, which showed higher relevance ratings for the high-relevance (accessory and clothing) ads than for the low-relevance (phones/computers and food) ads, M = 6.56 vs. 3.38, t(37) = 20.31, p < .05.

4.2 Ad and Product Measures

A size × relevance mixed-design two-way ANOVA was performed on each of the dependent measures, and no significant effects were found. Feedback from the participants suggested that the food advertisements strongly captured their visual attention. Heat maps for the four categories of ad products show this difference (see Fig. 2). The mean total fixation time was 2.73 s for the food ads, compared with 1.60 s, 1.69 s, and 1.55 s for the phones/computers, accessory, and clothing ads, respectively. As the food ads may have automatically captured attention and thereby diminished the ad relevance effect, trials with food ads were excluded from the subsequent analyses.
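The reanalysis without the food-ad trials can be sketched as a mixed-design ANOVA in Python with pingouin; the data layout, column names, and file name below are assumptions chosen for illustration, not taken from the paper.

```python
import pandas as pd
import pingouin as pg

# Hypothetical trial-level ratings file with columns:
# participant, interface ('app'/'web'), relevance ('high'/'low'),
# ad_category, ad_click_intention (1-9 rating)
trials = pd.read_csv("trial_ratings.csv")

# Exclude the food-ad trials, as in the reported reanalysis.
trials = trials[trials["ad_category"] != "food"]

# Aggregate to one mean rating per participant x relevance cell.
cells = (trials.groupby(["participant", "interface", "relevance"], as_index=False)
               ["ad_click_intention"].mean())

# Mixed-design ANOVA: relevance within subjects, interface between subjects.
aov = pg.mixed_anova(data=cells, dv="ad_click_intention",
                     within="relevance", subject="participant",
                     between="interface")
print(aov[["Source", "F", "p-unc", "np2"]])
```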

Fig. 2. Heat maps for the four types of banner ads in a small/App interface example: (from left to right) food, phones/computers, accessory, clothing.

The dependent measures were again submitted to size/type × relevance ANOVAs. Ad relevance did not affect participants’ attitudes or purchase intentions for the webpage products. Significant effects of ad relevance were found for the ads and the advertised products: the willingness to click on the ad, F(1,37) = 6.99, p < .05, ηp² = 0.16, the purchase intention for advertised products, F(1,37) = 45.41, p < .05, ηp² = 0.55, the average ad attitude (the average of the two ad attitude questions), F(1,37) = 5.23, p < .05, ηp² = 0.61, ad attitude 1, F(1,37) = 3.71, p < .05, ηp² = 0.46, and ad attitude 2, F(1,37) = 22.49, p < .05, ηp² = 0.38, were all higher for high- than for low-relevance ads. Participants were more willing to click on the ad and to purchase the advertised products, and held a more favorable attitude toward the ad, when the ad was of high rather than low relevance to the webpage products, supporting H4a.

Ad relevance interacted with interface size/type for the purchase intention for advertised products, F(1,37) = 10.73, p < .05, ηp² = 0.23, and for average ad attitude, F(1,37) = 5.04, p < .05, ηp² = 0.59. Simple main effect analyses showed that purchase intention was higher for high- than for low-relevance ads in the large/web store interface, F(1,19) = 38.14, p < .05, ηp² = 0.66, whereas ad relevance did not affect purchase intention in the App store interface, F(1,18) = 2.20, p > .05. The simple main effect analysis for average ad attitude likewise showed that ad attitude was higher for high- than for low-relevance ads in the large/web interface, F(1,19) = 13.30, p < .05, ηp² = 0.93, but not in the small/App store interface, F(1,18) = 0.001, p > .05 (see Fig. 3). These findings supported H4b. The interface size/type × ad relevance interaction was also significant for ad attitude 1, F(1,37) = 7.62, p < .05, ηp² = 0.17, but not for ad attitude 2, the willingness to click on the ad, or the attitude and purchase intention for webpage products.

Fig. 3. Mean ratings of purchase intention (on a scale from 1 (not at all) to 9 (very)) for advertised products (top) and average ad attitude (bottom) as a function of ad relevance and interface size/type. Error bars represent standard errors.

The main effect of interface size/type was significant for the purchase intention for advertised products, F(1,37) = 6.73, p < .05, ηp² = 0.15, and for ad attitude 2, F(1,37) = 4.98, p < .05, ηp² = 0.12, but not for any other measure. Participants were more willing to purchase the advertised products and found the banner ad more pleasant when the ad was displayed in the large/web rather than the small/App interface.

4.3 Gaze Measures

The total gaze duration, the total number of gazes, the gaze duration on the banner ads, and the number of gazes falling on the banner ads were each analyzed using an interface size/type × ad relevance ANOVA. Total gaze duration: the duration was longer for high- than for low-relevance ads, F(1,37) = 14.22, p < .05, ηp² = 0.315; the interaction effect, F(1,37) = 1.80, p > .05, and the main effect of interface size/type, F < 1, were not significant. Total number of gazes: the main effect of ad relevance was not significant, F(1,37) = 2.91, p > .05; the effect of interface size/type was marginally significant, F(1,37) = 3.66, p = .06, indicating a greater number of gazes for the large/web than the small/App interface (see Fig. 4). Total gaze duration on the banner ads: no significant effects. Number of gazes on the banner ads: no significant effects. Thus, to carry out the target product search task, participants deployed fewer gazes when the store interface was small/App rather than large/web, but their gaze duration did not differ between the two interface size/types, partially supporting H2. Neither the number of gazes on the banner ads nor the total gaze duration on the ads varied with the interface size/type, so H3 was not supported.
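The gaze measures above can be derived from fixation-level eye-tracking exports by summing fixation durations and counting fixations per trial, both overall and within the banner-ad area of interest. The sketch below assumes a generic fixation table and illustrative AOI coordinates; neither the column names nor the coordinates come from the paper or from any specific Tobii export format.

```python
import pandas as pd

# Hypothetical fixation export: one row per fixation, with columns
# participant, trial, duration_ms, x, y (gaze position in screen pixels).
fix = pd.read_csv("fixations.csv")

# Illustrative banner-ad area of interest (screen-pixel bounding box).
AD_AOI = {"x0": 300, "x1": 980, "y0": 820, "y1": 940}
in_ad = (fix["x"].between(AD_AOI["x0"], AD_AOI["x1"]) &
         fix["y"].between(AD_AOI["y0"], AD_AOI["y1"]))

# Per-trial totals over the whole interface.
overall = fix.groupby(["participant", "trial"]).agg(
    total_gaze_duration_ms=("duration_ms", "sum"),
    n_gazes=("duration_ms", "size"))

# Per-trial totals restricted to the banner-ad AOI.
on_ad = fix[in_ad].groupby(["participant", "trial"]).agg(
    ad_gaze_duration_ms=("duration_ms", "sum"),
    n_ad_gazes=("duration_ms", "size"))

measures = overall.join(on_ad).fillna(0)
print(measures.head())
```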

Fig. 4. Total gaze duration (top) and the number of gazes (bottom) as a function of ad relevance and interface size/type. Error bars represent standard errors.

4.4 Discussion

The current findings show that users’ processing scope differs when they shop in a large/web as opposed to a small/App store interface, owing to the size difference between the devices. The narrow processing scope resulting from the small spatial extent of the App store interface diminished the ad relevance effect and reduced the number of gazes deployed in the target product search task.

Such effects of display size contrast with previously reported effects of screen size on performance [6–10]. Viewing the small/App interface was not more effortful than viewing the large/web interface; in fact, our participants used fewer, not more, gazes to locate the target products in the former than in the latter. This is likely because the visual angles of the product array and the banner ad were kept similar between the large/web and the small/App store interfaces in the current study, both being approximately equivalent to those of a 5-inch phone viewed at a hand-held distance. As such, the content and legibility of the critical product display and banner ads placed the small display at little disadvantage relative to the large one. This situation is characteristic of real-world conditions, considering the prevalence of phones such as the iPhone 6 Plus, Samsung Galaxy Note 3 and 4, and Sony Xperia Z Ultra, all of which have screens well over five inches. The current practice of App design, using large product images, limited text, and hidden, relocated, or reduced navigation schemes, results in clear and clean App store interfaces. The current findings indicate, however, that the narrower spatial processing extent of the small/App interface still affects users’ processing scope, which in turn affects shopping behavior.

As a result, the App store user, characterized by a narrow processing scope, is more “focused” in allocating resources to the current task than the web store user. Such a focused shopper is ill-suited to simultaneously interacting with the external environment, which may help account for the limited amount of shopping conducted on the go with mobile phones [4]. Our finding that the small/App interface was associated with fewer gazes, though not shorter gaze durations, is consistent with this “narrow and focused” picture of the App store user. It is interesting to note that a study comparing search performance on small and large screens similarly found that eye movements on the small screen were more limited and the visual scanning patterns narrower than on the large screen [10].

The current findings also suggest that ad placement in an App store interface may require different considerations from those in a web store interface. On the one hand, context relevance is no longer as critical a guideline as it is in the web store interface. Future studies should clarify whether an ad relevance effect defined in terms of self-relevance [17] is less susceptible to the narrow processing scope of the App interface. On the other hand, participants on average reported lower purchase intentions and less favorable attitudes toward the advertised products and ads in the small/App than in the large/web store interface, even though the visual information in the two interface size/types was comparable. The large/web interface, allowing a wider processing scope, may have enabled the viewer to assimilate both relevant and less relevant ad products with the product category in the product array, increasing the evaluation of the ads and advertised products.

The current findings are limited by the fixed location of the banner ad, the simulation of the mobile phone screen on a computer monitor, and the use of mouse clicks rather than touches and swipes as the interaction response. Nevertheless, the consistent effects found here suggest the need for future studies to examine other dimensions of user behavior associated with display size.