Bipeline: A Web-Based Visualization Tool for Biclustering of Multivariate Time Series

Cachucho, Ricardo; Liu, Kaihua; Nijssen, Siegfried; Knobbe, Arno

doi:10.1007/978-3-319-46131-1_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9853)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Large amounts of multivariate time series data are generated every day, and understanding this data and finding patterns in it is a pressing task. To find prominent patterns in multivariate time series, one can use biclustering, that is, looking for subsets of variables that show coherent behavior over a number of time periods. This calls for a tool that supports experimentation.

Here, we present Bipeline, a web-based visualization tool that provides both experts and non-experts with a pipeline for experimenting with multivariate time series biclustering. With Bipeline, it is straightforward to save experiments and try different biclustering algorithms, enabling users to intuitively go from pre-processing to visual analysis of biclusters.

1 Introduction

The development of sensor networks has resulted in an explosion of time series data in recent years. These are large multivariate time series, where variables are collected synchronously over time. Thus, pattern mining of multivariate time series is becoming highly relevant, both in scientific research and industrial applications. Note that in the multivariate setting, not only are patterns in one variable over time relevant, but relationships between multiple variables can also provide useful insights. This task can be seen as clustering both time periods and variables, also known as biclustering [8–10].

Given a multivariate time series, it could be useful to try different biclustering algorithms. Also, one needs to optimize parameters across different steps, such as pre-processing, segmentation and biclustering itself. For each of these steps, there are many parameters to be optimized, leading to a large number of experiments. Furthermore, at each step, visual inspection is highly important for researchers to validate their findings. However, there is a lack of tools for this process.

We propose Bipeline, a web-based visualization tool that provides a pipeline for applying biclustering to multivariate time series. This tool is readily accessible to anyone via a web-based interface, allowing them to navigate through multiple experimental settings. Parameters can be interactively tuned with web components such as checkboxes, sliders and drop-down menus. At each step of the biclustering process, feedback is provided by means of visualizations, with plots of pre-processed time series, segmentation boundaries and biclusters. One or more biclusters can be plotted with a simple selection procedure.

2 Related Work

Until now, biclustering software tools with a graphical user interface have been developed to deal with biological gene expression data. BicOverlapper [1] is a tool for visual inspection of gene expression biclusters, introducing a novel visualization algorithm Overlapper to represent biclusters. Similarly, BiCluster Viewer [2] is a visualization tool for efficient and interactive analysis of large gene expression datasets. BicAT [3] implements multiple biclustering algorithms, for visualization and analysis of biclusters for expression data. BiGGEsTS [4] provides an environment for biclustering time series gene expression data.

All tools mentioned above integrate techniques for pre-processing and biclustering analysis, specifically for gene expression data. Their main purpose is to support biologists with the analysis and exploration of the gene expression data. However, these tools do not support biclustering analysis for multivariate time series. Also, most of them do not provide a pipeline experiment environment. Bipeline provides such a pipeline, where intermediate results can be inspected and saved. Using a friendly and interactive plotting environment, both non-experts and experts can pre-process, segment and analyze biclusters for multivariate time series.

3 Tool Overview

Bipeline is a web-based application that provides a pipeline to pre-process, segment and bicluster multivariate time series. An online version is available [12], which is compatible with all modern web browsers and across different client platforms. Both the user interface in the web browser and the server are implemented using the R package Shiny [5]. In Fig. 1, the system architecture illustrates the experimental pipeline and how each individual step relates to the other steps:

Importing: Users can upload datasets and have a first view of the data table and descriptive statistics (minimum, maximum, mean, ...). This first inspection, although useful, is not enough to assess the quality of the data.

Plotting: To gain further insight into the time series, visual inspection is crucial. The plotting panel includes multiple interactive plotting views, built with the R plotting package dygraphs [6]. An example of these plots is illustrated in Fig. 2(a). These interactive plots support zooming in and out, which is highly desirable for visual inspection of large time series.

Pre-processing: This panel allows preliminary handling of data such as: excluding variables, normalization, conditional removal and replacement of data, and outlier removal. Users can alternate between plotting (Fig. 2(a)), and pre-processing (Fig. 2(b)) until satisfied, then export the pre-processed data by clicking the Save button.
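As a rough illustration of two of the pre-processing operations named above, the following Python sketch normalizes a single variable and replaces outliers. The paper does not state which normalization or outlier rule Bipeline applies, so the z-score rescaling, the k-sigma rule and the function names `zscore` and `replace_outliers` are assumptions made for this example only.

```python
from statistics import mean, median, pstdev


def zscore(values):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series carries no variation; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def replace_outliers(values, k=3.0):
    """Replace points more than k standard deviations from the mean by the median.

    Assumed rule for illustration; not taken from the paper.
    """
    mu = mean(values)
    sigma = pstdev(values)
    med = median(values)
    return [med if sigma > 0 and abs(v - mu) > k * sigma else v for v in values]


if __name__ == "__main__":
    readings = [10.0] * 20 + [1000.0]   # one obvious spike
    cleaned = replace_outliers(readings)
    print(cleaned[-1])                  # the spike is replaced by the median
    print(zscore([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Replacing rather than deleting the outlier keeps the series the same length, which matters when several variables are sampled synchronously.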

Segmentation: This panel segments the data, one of the steps necessary for biclustering as suggested by [10]. By default, all variables share the same parameter settings: window size, overlap and threshold, each of which can be easily tuned. For greater flexibility, the user can dynamically create new tabs to set the parameters for individual variables. Additionally, a minimum segment size can be set, and the tool will merge each shorter segment into its most similar contiguous segment. Segmentation results can be visualized (Fig. 2(c)), saved and (re-)loaded, allowing the results to be used during the next step, biclustering.
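To make the roles of window size, overlap, threshold and minimum segment size more concrete, here is a minimal Python sketch of a sliding-window segmentation followed by merging of short segments. It is not the procedure of [10]: the comparison of consecutive window means is a stand-in for whatever dissimilarity measure Bipeline uses, and the function names `segment` and `merge_short` are hypothetical.

```python
from statistics import mean


def segment(series, window=4, overlap=0, threshold=1.0):
    """Return boundaries [0, ..., len(series)] splitting one variable into segments.

    Assumption: a boundary is placed at the start of a window whose mean
    differs from the previous window's mean by more than `threshold`.
    """
    step = window - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than window")
    starts = list(range(0, len(series) - window + 1, step))
    boundaries = [0]
    for prev, cur in zip(starts, starts[1:]):
        prev_mean = mean(series[prev:prev + window])
        cur_mean = mean(series[cur:cur + window])
        if abs(cur_mean - prev_mean) > threshold:
            boundaries.append(cur)
    boundaries.append(len(series))
    return boundaries


def merge_short(series, boundaries, min_size=3):
    """Merge each segment shorter than min_size into its most similar neighbour.

    Similarity is assumed here to be closeness of segment means.
    """
    bounds = list(boundaries)
    changed = True
    while changed and len(bounds) > 2:
        changed = False
        for i in range(len(bounds) - 1):
            lo, hi = bounds[i], bounds[i + 1]
            if hi - lo >= min_size:
                continue
            seg_mean = mean(series[lo:hi])
            candidates = []  # (distance to neighbour, index of boundary to drop)
            if i > 0:
                left_mean = mean(series[bounds[i - 1]:lo])
                candidates.append((abs(seg_mean - left_mean), i))
            if i + 2 < len(bounds):
                right_mean = mean(series[hi:bounds[i + 2]])
                candidates.append((abs(seg_mean - right_mean), i + 1))
            _, drop = min(candidates)
            del bounds[drop]
            changed = True
            break  # boundaries changed; rescan from the start
    return bounds


if __name__ == "__main__":
    signal = [0.0] * 4 + [5.0] * 4 + [0.0] * 4
    print(segment(signal, window=4, overlap=0, threshold=1.0))
```

Representing a segmentation as a sorted list of boundary indices keeps merging simple: joining two contiguous segments amounts to deleting the boundary between them.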

Biclustering: In Bipeline, we implement a number of biclustering algorithms, grouped into three categories. The baseline algorithms allow users to try well-known biclustering algorithms (e.g., Cheng & Church) [8, 9], which are implemented using the R package biclust [7]. Segmentation + Baseline biclusters the time series using an average representation of each segment, instead of using individual rows. Segmentation + BiclusTS is a novel algorithm [10] introduced to recognize similarities between segments, using probability density-difference estimation [11]. All biclusters are plotted as colored blocks, as shown in Fig. 2(d). Users can select the biclusters they want to see, and the plot updates in real time.
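Two ideas from this step can be illustrated with a short Python sketch: the average representation of each segment, and the mean squared residue that Cheng & Church [8] use to score a candidate bicluster (a score of zero means the selected rows and columns follow a perfectly additive pattern). This is a sketch under stated assumptions, not Bipeline's code or the biclust package: data is assumed to be a list of rows (time points) by columns (variables), and the names `segment_averages` and `mean_squared_residue` are introduced here for illustration.

```python
from statistics import mean


def segment_averages(data, boundaries):
    """Collapse each segment of rows (time points) into one row of per-variable means."""
    n_vars = len(data[0])
    return [
        [mean(row[j] for row in data[lo:hi]) for j in range(n_vars)]
        for lo, hi in zip(boundaries, boundaries[1:])
    ]


def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by row and column indices.

    For each cell, the residue is the value minus its row mean and column mean
    plus the overall mean of the submatrix.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [mean(r) for r in sub]
    col_means = [mean(c) for c in zip(*sub)]
    overall = mean(row_means)  # valid because all rows have equal length
    return mean(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(len(rows))
        for j in range(len(cols))
    )


if __name__ == "__main__":
    data = [[1.0, 10.0], [3.0, 10.0], [5.0, 20.0], [7.0, 40.0]]
    reduced = segment_averages(data, [0, 2, 4])
    print(reduced)
    print(mean_squared_residue(reduced, rows=[0, 1], cols=[0, 1]))
```

Scoring the reduced matrix instead of the raw rows mirrors the Segmentation + Baseline idea: a baseline algorithm sees one row per segment rather than one row per time point.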

Multiple features are shared by both Segmentation and Biclustering. Plots and parameter tables from different experiments are kept in history, allowing users to navigate back and forth to compare results and optimize parameters. During computationally expensive tasks, the front-end displays a progress bar, while the back-end server is busy carrying out the calculations. Furthermore, interactive web components can be saved into images with a single click.

4 Conclusion

We propose Bipeline, a web-based visualization tool that provides a pipeline for applying biclustering to multivariate time series. Its main features include visual inspection at multiple stages, interactive plots with zooming, easy navigation, storage of results, and saving of plots and experimental settings with a single click. Bipeline's intuitive web-based design makes it accessible to both experts and non-experts, and compatible across platforms.

References

1. Santamaría, R., Therón, R., Quintales, L.: BicOverlapper: a tool for bicluster visualization. J. Bioinform. 24(9), 1212–1213 (2008)
2. Heinrich, J., Seifert, R., Burch, M., Weiskopf, D.: BiCluster viewer: a visualization tool for analyzing gene expression data. In: Bebis, G., et al. (eds.) ISVC 2011, Part I. LNCS, vol. 6938, pp. 641–652. Springer, Heidelberg (2011)
3. Barkow, S., Bleuler, S., Prelic, A., Zimmermann, P., Zitzler, E.: BicAT: a biclustering analysis toolbox. J. Bioinform. 22(10), 1282–1283 (2006)
4. Gonçalves, J., Madeira, S., Oliveira, A.: BiGGEsTS: integrated environment for biclustering analysis of time series gene expression data. J. BMC, 1–11 (2009)
5. Chang, W., Cheng, J., Allaire, J., Xie, Y., McPherson, J.: Shiny: Web Application Framework for R. R package version 0.13.1 (2016)
6. Vanderkam, D., Allaire, J., Owen, J., Gromer, D., Shevtsov, P., Thieurmel, B.: Dygraphs: Interface to 'Dygraphs' Interactive Time Series Charting Library. R package (2016)
7. Kaiser, S., Santamaria, R., Khamiakova, T., Sill, M., Theron, R., Quintales, L., Leisch, F., DeTroyer, E.: Biclust: BiCluster Algorithms. R package version 1.2.0 (2015)
8. Cheng, Y., Church, G.: Biclustering of expression data. In: Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology, pp. 93–103 (2000)
9. Madeira, S., Oliveira, A.: Biclustering algorithms for biological data analysis: a survey. J. IEEE/ACM Trans. Comput. Biol. Bioinform. 1, 24–45 (2004)
10. Cachucho, R., Nijssen, S., Liu, K., Knobbe, A.: Bipeline: a web-based visualization tool for biclustering of multivariate time series. In: Berendt, B., Bringmann, B., Fromont, E. (eds.) ECML PKDD 2016, Part III. LNCS (LNAI), vol. 9853, pp. 12–16. Springer, Heidelberg (2016)
11. Sugiyama, M., Kanamori, T., Suzuki, T., Plessis, M., Liu, S., Takeuchi, I.: Density-difference estimation. In: Proceedings of NIPS, pp. 683–691 (2012)
12. http://fr.liacs.nl:7000

Author information

Authors and Affiliations

LIACS, Leiden University, Leiden, The Netherlands
Ricardo Cachucho, Kaihua Liu, Siegfried Nijssen & Arno Knobbe

Corresponding author

Correspondence to Ricardo Cachucho.

Editor information

Editors and Affiliations

Department of Computer Science, KU Leuven, Leuven, Belgium
Bettina Berendt
Deloitte GmbH, München, Germany
Björn Bringmann
Laboratoire Hubert Curien, Jean Monnet University, Saint-Etienne, France
Élisa Fromont
Allianz SE, Munich, Germany
Gemma Garriga
Max-Planck-Institute for Informatics, Saarbrücken, Germany
Pauli Miettinen
Aalto University School of Science, Espoo, Finland
Nikolaj Tatti
Siemens AG & Lud. Max. Univ. of Munich, Munich, Germany
Volker Tresp

Cite this paper

Cachucho, R., Liu, K., Nijssen, S., Knobbe, A. (2016). Bipeline: A Web-Based Visualization Tool for Biclustering of Multivariate Time Series. In: Berendt, B., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2016. Lecture Notes in Computer Science, vol 9853. Springer, Cham. https://doi.org/10.1007/978-3-319-46131-1_3

DOI: https://doi.org/10.1007/978-3-319-46131-1_3
Published: 03 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46130-4
Online ISBN: 978-3-319-46131-1
eBook Packages: Computer Science, Computer Science (R0)
