
1 Introduction

In contemporary information societies, survey researchers face growing difficulties with traditional recruitment methods, coupled with rising data collection costs. To secure representative response rates when collecting large-scale datasets, researchers have increasingly turned to so-called mixed-mode approaches. Mixed-mode data collection, which refers to offering two or more collection methods for responding to a survey, has become more common in recent years. Web surveys and mail/web mixed-mode surveys, in particular, are increasing in prevalence. It is therefore crucial to establish whether data collected through different modes can actually be combined into one cross-sectional study [1, 2].

The use of mixed-mode survey design is based on the idea of the equivalence of data collection. This means that researchers who utilize mixed-mode surveys aim to combine the data into one analyzable data matrix. However, there are several threats to the integrity of mixed-mode data. Researchers have noted that the use of web-based surveys is problematic in a number of ways. The primary problem relates to the selection of respondents when compared to traditional recruitment methods [3, 4, 5]. In a mixed-mode survey, it is likely that respondents’ answers will vary according to response mode. This is not a problem if the differences are related to demographic background characteristics of the respondents that can be controlled for. However, if the response mode in itself (Internet vs. mail) leads to different response processes and thus affects the responses, the problem of data integrity is real [6].

It has been argued that the risk of social desirability bias is lower in self-administered surveys than in interviewer-administered surveys (for a review, see [7, 8, 9]). As such, one of the main reasons to use self-administered, rather than interviewer-administered, mixed-mode surveys is to reduce social desirability bias [6]. Yet, little is known about differences in the prevalence of social desirability bias between mail respondents and web respondents in mail/web mixed-mode surveys.

In this paper, we analyze differences in responses to sensitive attitudinal questions between two modes of data collection. We ask whether, in a population-level survey, reported attitudes toward immigration differ between data collected via mail and data collected via the Internet. Our data are derived from the Finnish section of the International Social Survey Programme (ISSP) 2013 (n = 1,243).

We assume that the chosen response mode indicates, to some extent, a qualitative selection of respondents, especially in terms of digital lifestyle. Conversely, choosing the paper questionnaire may itself indicate a lack of digital skills. Such differences can cause the findings to suffer from both coverage and measurement errors. For instance, if differences exist, the prevalence of combined data collection may distort the datasets used to make sample-to-population generalizations in survey research. In addition, if responses are biased between the two modes, this may challenge previously proposed results indicating that individuals from politically marginal groups are more willing to express their views online than offline.

Finland provides an interesting research context, since the central population register allows for drawing reliable random samples of different population groups. Finland is also a leading information society, especially in terms of Internet penetration rates among citizens. In terms of practical implications, our findings demonstrate that the mode of response matters in population-level surveys. Our study also contributes to recent research on surveying socially and politically sensitive issues.

1.1 Online Surveys

Growing Internet access plays an important role in the development of social surveys. By the second decade of the 21st century, Internet penetration rates had exceeded 80% in most developed countries [10]. As people spend more time on the Internet, many important social phenomena come to exist exclusively on online platforms. A mixed-mode approach is thus part of the solution to the challenges arising from the use of new communication technology.

Due to changes in the population’s ICT use habits, efforts have been made to offer respondents sampled for traditional postal surveys the opportunity to respond via the Internet. It is often in researchers’ best interest to utilize a mixed-mode survey design. From a cost-effectiveness perspective, surveys conducted online are an overwhelming favorite: compared to traditional data collection methods, they are, in principle, faster and cheaper to implement [11]. Furthermore, survey research can take advantage of combined data collection methods at several stages: in recruiting respondents, in the actual data collection, in re-collecting responses, or, in multi-stage studies, in the next collection phase after the initial reminder, for example [12].

It has been noted that the use of web-based surveys is problematic in a number of ways, with the primary problem being the selection of respondents when compared to traditional methods [3]. From the perspective of data quality, this is a particular problem because non-response resulting from selection is difficult to correct with weighting, potentially causing the findings to suffer from coverage errors [4, 13].

In mixed-mode data collection conducted at the population level, online response is often treated as a supplementary data collection method, which reduces the risks associated with lack of network access. On the other hand, the selection may also be qualitative, which poses its own challenges for mixed-mode data collection.

But how can the response mode affect answering qualitatively? It has been found that online respondents tend to be more skilled Internet users than those who choose an alternative response option [14]. Accordingly, it is possible that qualitative differences trace back to differences in respondents’ socio-demographic backgrounds. For example, research led by Dillman [15] examined variation between different response methods and found that respondents to online and paper questionnaires differed in their socio-demographic profiles. However, that research did not find any connection between response modes and attitudinal variables [15]. In contrast, Mark Saunders [16] noted a connection between response methods and attitudinal variables in his examination of British public-sector workers. In same-sized samples surveyed at the same time, web-based respondents reported that they were more committed employees, had more confidence in the organization’s leadership, and received more support than those who responded with the paper survey [16].

However, neither Dillman’s [15] nor Saunders’ [16] study controlled for the effects of standardized background variables, nor was the interaction between response methods and background variables analyzed. It is conceivable that the socio-demographic composition of web-based respondents indirectly influences variation in the outcome variables. It is impossible to state unequivocally, for example, whether the differences presented by Saunders [16] are actually due to professional status, which can increase the probability of responding online rather than by post.

Atkeson et al. [17] analyzed mixed-mode data collection methods, the differences between web-based and paper respondents, and the link between response methods and political behavior. Their descriptive analysis demonstrates that Internet and paper respondents differ from each other to a certain extent on the basis of demographic factors: paper respondents tended to belong to lower income groups, to be older, and to have less education than Internet respondents. On the basis of preliminary analyses, the response groups’ political behavior varied in several respects. However, after controlling for socio-demographic factors, no differences in political behavior were found between the response groups [17].

The significance of response methods by age has also been researched with restricted datasets. For example, Sean McCabe [18] used combined data collection to survey alcohol use among American college students. Alcohol use did not vary systematically according to the response method, but the same demographic trends seen in other research were observed: Internet respondents were slightly younger and more often male than paper respondents [18]. Dana de Bernardo and Anna Curtis [19] reached similar results in their assessment of suitable response methods for those over 50 years old. On the basis of mixed-mode data collection, they found that Internet respondents were more educated, better off, and also slightly younger than paper respondents.

Similarly, in Finland, Koivula et al. [20] have compared the characteristics of mail and web respondents. In addition to a demographic comparison, they analyzed potential response effects on measures commonly used in well-being research. According to the results, the mode of response was clearly associated with age. However, it was not strongly associated with measures of subjective well-being after controlling for respondents’ backgrounds.

A recent study by Kim et al. [1] suggests that in mail/web mixed-mode surveys, more attention should be paid to the phenomenon called straightlining. Straightlining refers to respondents’ tendency to choose identical or nearly identical response options for all items in a question battery. Although Kim et al. did not find differences in straightlining behavior between mail and web respondents in their study, a mode effect on straightlining is a potential threat to the data integrity and quality of mail/web mixed-mode surveys.
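As an illustration, a simple straightlining flag can be computed by checking whether a respondent’s answers vary at all across a battery. The Stata sketch below uses hypothetical item names (item1–item3), not the actual ISSP variables:

    * Flag straightliners: zero variation across all items in a battery
    * (item1-item3 are hypothetical placeholder names)
    egen batt_sd = rowsd(item1 item2 item3)
    gen straightliner = (batt_sd == 0) if !missing(batt_sd)
    tabulate straightliner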

1.2 Social Desirability Bias

One could argue that the mode of response may have an effect on certain types of sensitive questions, such as those concerning immigration attitudes. This is true especially for public opinions that relate to ‘political correctness’ or ‘social desirability’.

Collecting valid survey data on sensitive information dealing with respondents’ private, political, and illegal issues is traditionally seen as a challenge in survey research [21, 22]. In general, sensitive questions tend to lead to higher nonresponse rates and/or larger measurement errors than questions on other topics [23]. Added to this, surveying some issues, such as income level, religiosity, or sexual behaviors, may be considered culturally too delicate, or even taboo. Respondents may also feel offended if their personal privacy is challenged by intrusive questions [24]. It is well known that the risk of social desirability bias is clearly higher in interviewer-administered than in self-administered surveys [7]. In line with this, Klausch et al. [2] concluded in their recent study that interviewer-administered survey modes (i.e., face-to-face/telephone) that include attitudinal rating-scale questions should not be used in parallel with self-administered survey modes.

Nevertheless, the feeling of privacy might be context-dependent and relate to a sense of anonymity that varies with the research environment, i.e., the online vs. offline environment. Here, online survey techniques may offer advantages. It has been observed that pen-and-paper and web-based surveys do not yield identical results, particularly for sensitive questionnaire items [25, 26]. Kays et al. [27], for example, found that respondents are more likely to answer sensitive questions on the Internet than on a pen-and-paper option. Web participants’ feeling of anonymity likely plays a role here [5].

Differences can also result from measurement errors when filling in online questionnaires. Namely, it has been put forth that participants might understand online surveys differently than other survey methods [28]. In addition, it has been noted that online respondents more readily answer ‘I don’t know’ and show more non-differentiation on rating scales [29, 30].

In this study, we focus on the contextual impact on outcome variables when comparing two modes of response. The aim of the analysis is to assess whether the response mode divides respondents along specific issues. We examine this with the aid of subjective questions regarding immigration. Immigration is commonly considered a socially and politically sensitive topic in Western countries, since it connects easily with historical events such as colonialism and the Holocaust, as well as contemporary racial tensions in many countries [21, 31, 32].

2 This Study

We assume that Internet respondents differ from paper respondents on sensitive questions subject to social desirability bias. In order to control for the sensitivity of the immigration question, we construct separate measures for negative and positive attitudes towards immigrants. The questions we use are sensitive in nature, as they relate, for example, to crime or to the unemployment rates of immigrants. Therefore, we need to consider certain features in addition to normal sources of reporting error (e.g., misunderstanding the questions, lack of relevant information). Preceding research has also pointed out that under circumstances considered sensitive, respondents simply do not wish to tell the truth.

H1: We hypothesize that postal mode will elicit stronger attitudes towards immigrants, especially when examining negatively formulated questions.

It has been suggested that socially desirable responding in surveys varies between economic and socio-demographic groups, in addition to variance linked to features of the data collection situation, such as the degree of privacy [21]. While only a smaller body of research has focused on the demographic factors behind response bias, these studies point to at least one consistent finding: educated and younger individuals tend to express less conservative attitudes than those with less education and a lower economic position. For example, it has been noted that respondents with higher educational qualifications are more willing to give social approval to new political ideas while also tending to give less truthful responses concerning their own activities [33]. If there is a significant association between background variables and response mode, we can partially link response mode effects to respondents’ qualitative selection in terms of differential Internet usage. Here, we also need to acknowledge that recent research has reported notable socio-economic differences in Internet access and use patterns [34, 35]. This also applies to experiences on the Internet, for instance in terms of how useful or harmful online material can be for users [36].

H2: We assume that the key background variables have a confounding effect on the detected associations between mode of response and attitudes towards immigration.

3 Research Design

3.1 Data

The data are derived from the Finnish section of the International Social Survey Programme (ISSP) 2013 (n = 1,243). A total of 2,500 participants aged 18–74 years were approached using mixed-mode collection, offering the opportunity to respond online instead of on the paper form. Finland was the only participating country in which this kind of response mode option was applied in the ISSP 2013 data collection. It is noteworthy that all respondents were contacted in the same way, by conventional mail, and given the opportunity to answer the questionnaire online [37].

The original sample was selected from the Central Register of Population using random sampling. The survey yielded a response rate of 48% (1,208 of the 2,500 sampled). The final analytic sample of 1,208 observations consists of 695 web responses and 513 mail responses.

3.2 Variables

In the explorative analysis, we first focus on the variance of the attitudinal variables according to the response mode. We established two summed variables to measure attitudes towards immigration, a topic typically found to be sensitive to social desirability. The variables measure either positive or negative attitudes towards immigration. Given that negative connotations in the attitudinal variables are less desirable than positive ones, our two measures enable us to control for the sensitivity of questions dealing with immigration. Descriptive statistics of the dependent variables are shown in Table 1.

Table 1. Descriptive statistics for dependent variables

Let us first examine the negative expressions. The summed variable consists of three value-based questions, each of which has a fairly negative tone. Respondents were asked to state their opinion on a 5-point Likert scale ranging from strongly disagree to strongly agree; an additional option was available for undecided respondents. First, respondents were asked for their opinion on the statement “Immigrants increase crime rates”. The second item refers to employment issues: “Immigrants take jobs away from people born in [Country]”. The last item concerns respondents’ opinions regarding the effect of immigration on national culture: “Immigrants undermine culture”.

The second dimension was constructed on the basis of three items addressing immigration in a more positive way. These items were likewise assessed with a 5-point Likert-type scale ranging from strongly disagree to strongly agree. Here, the first item refers to economic issues, as respondents were asked to respond to the statement “Immigrants are generally good for the economy”. The second item assesses cultural issues: “Immigrants bring new ideas and cultures”. Finally, the last item concerns immigrants’ equality with native citizens: “Legal immigrants should have the same rights”.
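For concreteness, the two summed scales could be built along the following lines in Stata; the item names below are hypothetical placeholders rather than the actual ISSP variable names:

    * Reliability check for the negatively worded items (hypothetical names)
    alpha imm_crime imm_jobs imm_culture

    * Negative and positive attitude scales as row means of the items
    egen neg_att = rowmean(imm_crime imm_jobs imm_culture)
    egen pos_att = rowmean(imm_economy imm_ideas imm_rights)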

We also controlled for five independent variables: age, gender, education, economic activity, and place of residence. Descriptive information on these variables is shown in Table 2 by response mode. Age was categorized into five groups: under 30 years, 30–44 years, 45–54 years, 55–64 years, and 65–74 years. The years from 18 to 30 are often referred to as early adulthood. The next three age groups can be defined as early middle age, middle age, and late middle age, respectively. Finally, people over 64 years of age are often characterized as elderly, since in Finland a person is usually entitled to a pension after the age of 63.
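As a sketch, this age grouping can be reproduced in Stata as follows, assuming a variable age holding age in years:

    * Cut age into the five analysis groups (the upper bound 75 is exclusive)
    egen agegrp = cut(age), at(18 30 45 55 65 75)
    label define agegrp 18 "Under 30" 30 "30-44" 45 "45-54" 55 "55-64" 65 "65-74"
    label values agegrp agegrp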

Table 2. Descriptive statistics for independent variables by response mode

3.3 Analysis Strategy

First, we estimated variation in the attitudinal variables according to response mode using ordinary least squares (OLS) regression and analysis of variance (ANOVA). Second, we tested the indirect effects of the response mode through background variables using Sobel-Goodman mediation tests. We illustrate the main results in Fig. 1 using coefplots [38]. The analyses were performed with Stata 15.
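A rough outline of the estimation steps in Stata could look as follows; variable names are assumed for illustration, and sgmediation and coefplot are user-written commands available from SSC:

    * Base model: response mode only (web = 1, mail = 0)
    regress neg_att i.web

    * Adjusted model: response mode plus socio-demographic covariates
    regress neg_att i.web i.agegrp i.female i.educ i.activity i.residence

    * Predictive margins by response mode, as plotted in Fig. 1
    margins web
    marginsplot

    * Sobel-Goodman test of the indirect effect through education
    * (education is entered here as a single score for the mediation step)
    sgmediation neg_att, iv(web) mv(educ) cv(agegrp female activity residence)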

Fig. 1. Predictive margins (with confidence intervals) for response modes when explaining positive and negative attitudes towards immigrants. The direct and total effects are based on the OLS regressions in Table 3.

4 Results

The results of the regression analysis are shown in Table 3. The main results are illustrated in Fig. 1 by estimating predictive margins on the basis of the OLS models in Table 3.

Let us next examine the direct effects of response mode. The web response mode had a significant effect on negative attitudes (b = –0.19, p < 0.001). For positive attitudes, the results were similar in direction but not statistically significant (b = 0.08, p > 0.05).

Next, we added the covariates to the base model. The covariates significantly confounded the association between response mode and negative immigration attitudes (b = 0.06, p < 0.001). What is noteworthy, however, is that the effect of response mode remained significant (b = –0.11, p < 0.05). The R-squared indicates that the final model explains negative attitudes relatively well (0.14). In the case of positive attitudes, the effect of response mode was not significant in the models. The predictive power of the adjusted model was also more modest compared to the model predicting negative attitudes.

We also estimated the direct effects of the covariates. Age did not have a significant effect on either dimension. Women’s scores, however, were more positive (b = 0.16, p < 0.001) and less negative (b = –0.28, p < 0.001) towards immigration. The level of education contributed clearly to negative attitudes, as those with tertiary education expressed markedly less negative attitudes (b = –0.77, p < 0.001); this finding applies when those with only primary education are omitted from the data. In terms of positive attitudes, the differences were similar but weaker, as those with tertiary education reported more positive attitudes (b = 0.47, p < 0.001). Economic activity also had a significant effect on both dimensions when students, retired, and unemployed persons were compared to employed persons. Students reported more positive (b = 0.38, p < 0.001) and less negative (b = –0.22, p < 0.05) attitudes towards immigrants than others, whereas the economically inactive reported more negative attitudes (b = 0.32, p < 0.001).

Finally, we tested the indirect effect of the response mode through the covariates. The results of this procedure show whether the response mode effect is driven by demographic variables. We found that education confounded the response mode effect. The effect was stronger for negative attitudes (b = –0.06, p < 0.001), meaning that education (mainly tertiary education) explained roughly 30% of the response mode association. Education thus has a notable confounding effect, contributing to both response mode and immigration attitudes. What is noteworthy here, however, is that the effect of response mode remained statistically significant after controlling for education (Table 3).
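As a rough check using the coefficients reported above, the share of the total response mode effect channelled through education is the ratio of the indirect effect to the total effect: 0.06 / 0.19 ≈ 0.32, in line with the roughly 30% attributed to education.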

Table 3. Predicting positive and negative attitudes towards immigration by response mode. Direct, total and indirect effects

5 Discussion

In this paper, we were interested in whether the mode of response is associated with attitudes towards sensitive issues in mail/web mixed-mode surveys. More specifically, we examined whether respondents who chose to answer via the web questionnaire differed from postal questionnaire respondents in terms of social desirability bias. We used positive and negative questions on immigration to measure social desirability. Preceding research indicates that responses to questions concerning immigrants might be biased [31, 32]. In the survey data utilized in this paper, online response was an alternative to the postal questionnaire. The sample was selected from the Finnish Central Population Register using random sampling. The respondents were contacted by mail and given the opportunity to answer the questionnaire online.

As assumed (H1), responses in the mail survey mode differed when we examined attitudes towards immigration in a more sensitive (negative) form. The pattern was similar for the less sensitive (positive) attitudes, but there the response mode failed to have a statistically significant effect. Furthermore, we were able to confirm our second hypothesis (H2), as controlling for basic socio-demographic factors confounded the association between negative answers and response mode. However, not all of the detected variation could be attributed to the control variables, indicating that differences in responses more likely depend on other, unobservable factors associated with responding.

The findings call for a broader discussion. As online surveys have become a primary data collection technique in most disciplines, it is crucial to assess the strengths and weaknesses of online survey data collection in comparison to more traditional methods. In the mixed-mode context, the strengths of online surveys are easy to identify. Especially as a supplementary data collection method, online questionnaires offer advantages to both researchers and respondents. A combination of postal and Internet-based modes gives sampled respondents the possibility to select the questionnaire form they prefer. The Internet questionnaire also saves respondents the possible inconvenience of mailing the questionnaire form. For Internet-savvy population groups, the online questionnaire is a natural way to answer surveys; without this opportunity, these individuals might not answer the survey at all, a loss that researchers cannot afford.

The fact that we found notable differences between the answers given by online and offline respondents should encourage researchers to conduct web-based data collection more carefully. The selection bias related to online responding does not appear to be a significant problem after controlling for the effects of basic socio-demographic variables. Despite this, it appears that certain population groups are poorly represented in web-based surveys. Thus, at its best, web-based data collection alone can only offer approximate results. This being the case, a postal survey supplemented with a web questionnaire seems to have its place in survey research for the foreseeable future. As Bech and Kristensen [39] have further pointed out, due to lower response rates, the cost per online response can actually be significantly higher than the cost per postal response.

Our study obviously has its methodological and practical limitations. In particular, issues regarding the data sample, research design, and the survey items analyzed are notable here. First, we presented generalizations to a larger population on the basis of a cross-sectional dataset from one Nordic country. With this in mind, it is important to acknowledge restrictions that relate to the specific nature of the nation surveyed. Second, because the data came from a cross-sectional survey, we cannot make strong statements about the causal direction of the response mode effects. A plausible causal interpretation would require a panel dataset containing observations from the same respondents at different points in time. Third, we were not able to control for the possible effect of the digital platforms used in the online response mode. It is likely that responses may sometimes differ even between computer and mobile phone interfaces, for instance.

Finally, to better understand sensitivity bias, we would have needed items that measure perceived sensitivity directly. In the current study, we did not have items measuring personal experiences regarding sensitivity. Instead, we had to rely on assumptions derived from preceding research, especially regarding anonymity in different response modes. Given this, we cannot be certain how anonymity was actually experienced in the online mode. As noted earlier, we also lacked information on respondents’ ICT skills and other relevant attitudes influencing the choice of response mode. In this respect, it would be useful to conduct a more in-depth study that takes into account the impact of information security perceptions on the choice of response mode.