1 Introduction

Users need an interface to access e-Science resources, usually data and computing power. It is essential to know how e-infrastructures are actually used and who their users are. While an e-infrastructure can have thousands of users, those who use it extensively can be very few [1]. The interfaces for accessing e-Science resources include command line tools, web portals and User Interfaces (UIs) for accessing the data assets hosted by an e-infrastructure. These interfaces are the key gateways for performing e-Science operations such as creating, collecting, storing, sharing, publishing, searching, visualizing and processing data. The UIs of an e-infrastructure are designed and developed by its own designers and developers.

This study explores the interaction and the User eXperience (UX) of users of an e-Science infrastructure, the Earth System Grid Federation (ESGF) [2, 3]. ESGF is a global climate e-Science infrastructure that offers climate data projects, computing and visualization facilities to climate scientists. ESGF is currently also being extended to serve other domains such as biology, chemistry and astronomy. The UIs of e-Science infrastructures in general have not been adequately researched [4], and there is hardly any study that addresses the UX of e-Science, especially in the domain of climate science. These UIs nevertheless play a central role in the success of e-Science, because e-Science is considered a new paradigm for doing research and helps to fulfill the Science 2.0 vision. Science 2.0 is a term for research in which more collaboration amongst scientists is expected through technology, especially Web 2.0 technologies, as opposed to traditional laboratory science, which is termed Science 1.0.

This study shows that end-users of ESGF experience problems with the UIs and HCI components when obtaining the data they need and when looking for information in the web documentation. They therefore send requests to the user support centre or help desk in the hope of getting their problems solved. Consequently, offering a better UX and establishing efficient support on a wider scope are among the major challenges to be addressed if e-Science is to become an established, significant scientific method in the highly digitalized and open data society of the 21st century. This paper evaluates the UX and the usability of the UIs of the applications offered by the ESGF e-Science infrastructure and highlights the parts of the infrastructure and applications that need further enhancement.

This paper is structured as follows. Section 2 provides the background on the context and terminology of e-Science, UI and UX, together with an overview of work related to the research question. Section 3 explains the research steps taken to generate recommendations. The results of the study are presented and discussed in Sect. 4. Future work and conclusions are elaborated in Sect. 5 and Sect. 6, respectively.

2 Background and Related Work

Access to big scientific data for e-research is provided via e-Science infrastructures, which are deployed to access and share data, high performance computing (HPC) facilities and human resources in order to facilitate interdisciplinary research and harvest knowledge [2, 3, 5, 6, 7]. Users need an interface to access these resources, usually data. The interfaces include command line tools, web portals and Graphical User Interfaces (GUIs) for accessing the data assets, which are the main resources hosted [8]. However, while interacting with an e-Science infrastructure, a user may require help because of outages of some resources (e.g. servers), other anomalies, or even difficult user interfaces. In other cases, a user requires particular scientific or technical information [1]. To meet these user challenges, e-Science infrastructures, also known as cyber-infrastructures (CI) or virtual research environments (VREs), offer UIs and user support in the form of a help desk. Since the inception of cyber-infrastructures, however, these have not received adequate attention with regard to the users’ point of view [9].

Nevertheless, interaction with the UI of an e-Science infrastructure is not limited to end-users. It has been observed that other stakeholders, e.g. data publishers and data curators, also need better UIs for the tools and applications they use to publish data properly and make it accessible to the users of an e-Science infrastructure [3, 10, 11, 12]. Moreover, an e-Science infrastructure is mostly a decentralized structure of multiple organizations and data centers worldwide, and it has diverse users with diverse needs who are interested in doing e-research at multiple sites worldwide [5, 13]. The users and employees of research laboratories and data centers such as Lawrence Livermore National Laboratory (LLNL), the German Climate Computing Centre (DKRZ), the British Atmospheric Data Centre (BADC), the Jet Propulsion Laboratory (NASA-JPL) and others are generally scientists; they contribute to publishing big data on the one hand and use it on the other [5, 10]. All these facts motivate the study of the current UX, UI and related areas in the domain of cyber-infrastructures.

In this study we take the Earth System Grid Federation (ESGF) as a use case in the form of a single case study. ESGF is an important open data infrastructure in the field of climate science. It facilitates the study of climate change and of its impact on human society and the Earth’s ecosystem [14]. In the case of ESGF, it was previously felt that the UIs and tools that support users, including the user support process and the web documentation that guides users of a distributed, multi-organizational, research-oriented, non-commercial, collaborative environment, needed an overhaul [3]. Consequently, a new UI called CoG was suggested and implemented. Suggestions were also made to improve the user support process, and the Federated e-Science User Support Enhancement (FeUSE) framework was proposed [3, 15]. The better the UX of the tools offered by ESGF and the better the GUI, the more productive the e-research environment can become, which can in turn lead to a boom in e-research.

There are many books and articles that provide guidelines for designing an effective Graphical User Interface (GUI) in order to enhance user experience and usability, e.g. [16, 17, 18]. A study of guidelines for providing reliable information to users on the UI is found in [19], and the significance of line length for tablet PC users is discussed in [20]. However, it is not known whether these guidelines have been applied to the UIs of e-Science infrastructures that serve big data to a wide variety of users. Systematic evaluations of e-infrastructure UIs are still needed, and there is hardly any previous study that discusses the UX of e-Science UIs. Nonetheless, former studies relevant to UX in e-Science and to other issues pertaining particularly to ESGF include the study of the governance scheme of ESGF [6, 21], the thorough examination of the user support process of ESGF [2, 3, 10, 11, 15], the evaluation of the user support unit, i.e. the helpdesk process [8], the user interface of the tools used by the help desk unit [22], a model of how tasks can be coordinated, prioritized and accomplished [20], the visualization challenges of ecosystems [7], and others. The state of the art regarding current challenges in the field of open data and e-infrastructures is described in [14]. Problems with finding relevant information in web documentation are also common. Moreover, apart from better usability and UX, the possibilities for UI customization, UI and software extension, and collaboration features amongst users of e-Science infrastructures are very important when choosing a suitable UI.

In the last decade, the UIs and the user support in ESGF have been evolving, mainly because of changes in the governance scheme and in the technology employed by the ESGF cyber-infrastructure. For instance, the history of ESGF development shows that, owing to technological and organizational changes and especially to the introduction of new data projects served by the ESGF data archive system, the number of users and their new needs and requirements have been rising constantly. Technology and its use can affect the business model, service orientation and policies of organizations [23], and this is true for e-Science infrastructures as well. Up to now, ESGF employees have been designing and developing the UIs on a voluntary basis. The survey questionnaire conducted for this study revealed a number of issues, especially related to UX and UIs.

3 Research Methods

An online survey of data providers and consumers supported by ESGF was conducted in December 2016. The intent of the survey was to provide the ESGF community of developers with anonymous feedback about how ESGF can improve its core services and to ascertain what scientists believe are the greatest strengths and weaknesses of the ESGF enterprise. The Executive Committee (EC) distributed the survey via mailing lists associated with ESGF projects, resulting in a representative sampling of geographically and topically diverse responders. Descriptive results from the global survey attempt to shed light on the data needs of national and international projects. Action items generated from the survey results are intended to bridge the gap between short-term and long-term development and operations.

For this survey, the request for feedback went out to: (1) several World Climate Research Programme (WCRP)-endorsed Model Intercomparison Projects (MIPs), including the Coupled Model Intercomparison Project (CMIP), the Atmospheric Model Intercomparison Project (AMIP), and the Input for Model Intercomparison Project (Input4MIPs); (2) the Coordinated Regional Climate Downscaling Experiment (CORDEX); (3) the NASA-led Observations for Model Intercomparisons (Obs4MIPs); (4) the Accelerated Climate Modeling for Energy (ACME) project; and (5) the Collaborative RE-Analysis Technical Environment Intercomparison Project (CREATE-IP). Most questions asked researchers to rate their need for a specific support or service on a six-point Likert scale, i.e. a scale from 1 to 6, where 1 indicated little or no interest and 6 indicated high interest or need. Other questions required yes or no responses. The survey also presented open-ended questions. Weighted average values were calculated for each question across all responses (e.g., a value of 4.54 for a particular topic would indicate that most participants for that question would rate the topic as being of high or very high interest). Also calculated was the percentage of participants that gave a topic a particular rank (e.g., 37% ranks as a very high response). Merging the weighted average with the percentage of responses gave yet another perspective on the value of the survey response (e.g., 1.49 constitutes a very high community interest, taking into consideration the combined weighted average and the percentage of participants).
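The per-question statistics described above can be sketched in a few lines of Python. This is a minimal illustration and not the survey's actual analysis code: the function name `summarize_likert`, the choice of the top rank (6) for the percentage, and in particular the way the two values are merged (here a simple product) are assumptions. The product is merely consistent with the web documentation figures reported in Sect. 4, where 3.89 and 37.61% give about 1.46.

```python
from collections import Counter


def summarize_likert(responses, rank=6, scale_max=6):
    """Summarize one question answered on a 1..scale_max Likert scale.

    Returns (weighted_average, share_at_rank, combined_score).
    The combined score is ASSUMED to be the product of the two values;
    the paper does not state the exact merging formula.
    """
    if not responses:
        raise ValueError("at least one response is required")
    if any(r < 1 or r > scale_max for r in responses):
        raise ValueError("responses must lie between 1 and scale_max")

    counts = Counter(responses)
    n = len(responses)

    # Weighted average: each rank weighted by how often it was chosen.
    weighted_average = sum(value * count for value, count in counts.items()) / n

    # Fraction of participants who chose the given rank (hypothetically the top one).
    share_at_rank = counts[rank] / n

    combined_score = weighted_average * share_at_rank
    return weighted_average, share_at_rank, combined_score


# Hypothetical responses for a single question (not data from the survey).
example = [6, 6, 5, 4, 3, 6, 2, 5]
avg, share, combined = summarize_likert(example)
print(f"weighted average={avg:.2f}, share at rank 6={share:.1%}, combined={combined:.2f}")
```

Under this reading, a topic scores highly only if it has both a high average rating and a large share of respondents choosing the selected rank.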

4 Results and Discussion

Respondents were asked to identify themselves as a data provider, data consumer, or both. The survey also asked the respondents to describe their profession (e.g., undergraduate or graduate student, postdoctoral scholar, academic scientist, governmental scientist, or other) and type of affiliation (e.g., governmental agency, university or the private sector). Results show that Linux was the most commonly used platform among the respondents, followed by Windows and Mac OS X.

The bulk of the survey consisted of 42 questions listed under several subcategories that asked respondents to rate the importance of the service or tool. These subcategories included:

  • User interface (UI) (websites, CoG)

  • Ingestion of and access to large volumes of scientific data (i.e., from data archive to supercomputer and server-side analysis)

  • Web documentation

  • Improved UI designs and principles to enable easier access to computer and software capabilities (e.g., recommendation systems, more flexible and interactive interfaces)

  • Distributed global search

  • Unified data discovery for all ESGF data sources to support research

  • Quality control (QC) algorithms for data

  • Reliability and resilience of resources

  • Data access and usage

  • Remote computing capability

  • Data transport

  • QC issues

The first step in evaluating the responses was to list the subcategories in terms of need on a scale of 1 to 6:

  • 10 of the subcategories earned an average response rating of 4.1 or higher.

  • 17 earned between 4.1 and 3.7.

  • Remaining responses earned less than 3.6.

This spread indicates that ESGF users have diverse needs and priorities. Roughly 40% of responses, with a combined weighted score of 1.49, indicated that the ESGF UI, also known as CoG (www.earthsystemcog.org), was the most difficult feature to use and needs improvement. About 35% of responses, with a combined weighted score of 1.46, pointed to the need for sufficient access to large volumes of data with computational resources for server-side (i.e., remote) analysis and visualization. Also notable, at a combined weighted score of 1.46, was the emphasis on better, more reliable online documentation. Related to these changes, respondents requested an environment that supports more effective collaboration and sharing within and between science teams (e.g., collaborative tools), at a combined weighted score of 1.11. Of relevance to efforts to design a more integrated data and computing infrastructure was the finding that most respondents access data and compute resources via web interfaces or remote login along with application programming interfaces (APIs).

The question identified as the area of greatest need overall was “How important is knowledge gathering, managing, and sharing?” All questions in this category were rated less than 4.06 but higher than 3.8; no other category had such a high average. The topics included:

  • Direct data delivery into ESGF computing systems from distributed data resources—3.99/27.7%.

  • Data sharing—3.96/27.99%.

  • Web documentation—3.89/37.61%.

  • Data publishing (long-tail publishing for individual scientists)—3.89/27.41%.

  • QC algorithms for data—3.86/32.07%.

  • Ancillary data products (e.g., data plots, statistical summaries)—3.84/30.32%.
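To make the paired figures in the list above easier to compare, the following sketch ranks the six topics by a combined score. It assumes, as a hypothesis only, that the combined score is the product of the weighted average and the percentage; the paper does not give the formula, although this reading reproduces the value of about 1.46 reported for online documentation. The shortened topic labels and the helper `combined_score` are illustrative.

```python
# (weighted average, percentage) pairs copied from the list above.
topics = {
    "Direct data delivery": (3.99, 27.7),
    "Data sharing": (3.96, 27.99),
    "Web documentation": (3.89, 37.61),
    "Data publishing": (3.89, 27.41),
    "QC algorithms for data": (3.86, 32.07),
    "Ancillary data products": (3.84, 30.32),
}


def combined_score(weighted_average, percentage):
    """ASSUMED merging rule: weighted average times the fraction of responses."""
    return weighted_average * percentage / 100.0


# Sort topics from highest to lowest combined score.
ranked = sorted(
    ((name, combined_score(avg, pct)) for name, (avg, pct) in topics.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Under this assumption, web documentation comes out on top even though its weighted average is not the highest in the list, because a larger share of respondents gave it the selected rank.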

A question that raised significant interest among the survey participants was “How good are human-computer interactions?” Respondents identified collaborative environments, in particular, as a key requirement (3.63). The new ESGF mandate regarding data management and sharing has clearly penetrated the community and raises questions for many, as evidenced by the high scores for several related survey topics:

  • Easy way to publish and archive data using one of the ESGF data centers—3.89/27.41%.

  • User support for data access and download—4.21/29.45%.

  • Access to enough computational and storage resources—3.89/28.96%.

5 Future Work

Based on the results of this study, future research on enterprises and big data infrastructures based on fog, cloud and grid computing can be guided by one significant research objective: how users’ views can be incorporated into the design and development of the components, applications, processes and interfaces of e-infrastructures. The survey conducted in this study was instrumental in providing user feedback that tells software developers what users need. A detailed analysis of the survey results is left as future work and will be published separately. It is recommended that other e-Science initiatives also conduct user surveys at regular intervals to gauge the usability and UX of their e-infrastructures and thus provide better services to users.

Additionally, future research should focus on encouraging and persuading users of big data infrastructures to participate in the design, development and evolution of the applications related to data infrastructures and of the underlying processes. It would also be interesting to study how software designers and developers, especially those who design UIs and other software components, are currently meeting users’ requirements so that users can interact with applications without trouble, and how this can be done better. The process of developing software and interfaces, whether related to e-Science or not, needs to incorporate the users’ point of view in a form that allows developers to address users’ UI requirements. Since there are different types of users in e-Science, it is important to consider all groups of users [3]. The stakeholders in e-Science and ICT infrastructures include data scientists, data curators, computer scientists, domain experts, managers and, most importantly, interface designers and software engineers, who can provide input to enable better usability for users.

6 Conclusion

In this study, a user survey was used to observe the user experience (UX) in a federated e-Science environment in the climate science domain. It was observed that the features regarding HCI, the UIs of ESGF applications and the web documentation need further improvement in order to enhance user experience. To this end, the documentation needs to be updated regularly and made reliable and accessible to all types of users. Furthermore, the ingestion of and access to large volumes of scientific data (i.e., from data archive to supercomputer and server-side analysis) need to be improved. The respondents also indicated the need for an environment that supports more effective collaboration and sharing within and between science teams (e.g., collaborative tools). In essence, the concepts of service orientation and meeting users’ needs should be incorporated into the business models of big data enterprises, particularly governmental e-Science facilities.