Abstract
The forward search is a powerful general method for detecting multiple masked outliers and for determining their effect on inferences about models fitted to data. By monitoring a series of statistics computed on subsets of the data of increasing size, we obtain multiple views of any hidden structure. A long-standing problem of the forward search has been the lack of an automatic link among the great variety of plots that are monitored. Interesting features often emerge unexpectedly during the progression of the search, and only when a specific combination of forward plots is inspected at the same time. The analyst should therefore be able to interact with the plots and to redefine or refine the links among them. Without dynamic linking and interaction tools, the analyst risks missing relevant hidden information. In this paper we fill this gap by providing a set of new robust graphical tools, whose power is demonstrated on several regression problems. Through the analysis of real and simulated data we give a series of examples in which dynamic interaction with different “robust plots” is used to highlight the presence of groups of outliers and regression mixtures, and to appraise the effect that these hidden groups exert on the fitted model.
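To make the idea of monitoring statistics on subsets of increasing size concrete, the following is a minimal Python sketch of a forward search for linear regression. It is a simplified illustration, not the authors' implementation: the function name `forward_search`, the random-start initialisation (a least-median-of-squares-style criterion) and the monitored quantity (the smallest unscaled absolute residual among units outside the subset, used here in place of a properly scaled deletion residual) are all assumptions made for the example, as is the simulated data.

```python
import numpy as np

def forward_search(X, y, n_starts=200, seed=0):
    """Simplified forward search (illustrative sketch only).

    Returns the fitting subset used at each step and, for each step,
    the smallest absolute residual among the units outside that subset.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Initial subset (assumption): among random p-point fits, keep the one
    # with the smallest median squared residual over all observations.
    best_med, subset = np.inf, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        med = np.median((y - X @ beta) ** 2)
        if med < best_med:
            best_med, subset = med, idx

    subsets, monitor = [], []
    for m in range(p, n):
        # Fit by least squares on the current subset of size m.
        beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
        res2 = (y - X @ beta) ** 2
        outside = np.setdiff1d(np.arange(n), subset)
        subsets.append(np.sort(subset))
        monitor.append(np.sqrt(res2[outside].min()))
        # The next subset holds the m + 1 units with the smallest residuals.
        subset = np.argsort(res2)[: m + 1]
    return subsets, np.array(monitor)

# Simulated data (hypothetical): 55 units on a line plus 5 shifted outliers.
rng = np.random.default_rng(1)
n_clean, n_out = 55, 5
x = rng.uniform(0, 10, n_clean + n_out)
y = 1 + 2 * x + rng.normal(0, 0.1, n_clean + n_out)
y[n_clean:] += 50
X = np.column_stack([np.ones_like(x), x])
outliers = np.arange(n_clean, n_clean + n_out)

subsets, monitor = forward_search(X, y)
p = X.shape[1]
step = n_clean - p  # step at which the subset size equals the number of clean units
print("outliers inside subset at that step:", np.isin(outliers, subsets[step]).any())
print("monitored residual at that step:", monitor[step])
```

In this sketch the shifted units stay outside the subset until the final steps, so the monitored residual jumps once only outliers remain outside. A plot of `monitor` against subset size is the kind of forward plot that the linked, interactive tools of the paper are designed to explore.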
Cite this article
Perrotta, D., Riani, M. & Torti, F. New robust dynamic plots for regression mixture detection. Adv Data Anal Classif 3, 263–279 (2009). https://doi.org/10.1007/s11634-009-0050-y
DOI: https://doi.org/10.1007/s11634-009-0050-y
Keywords
- Forward search
- Robustness
- Exploratory data analysis
- Data visualization
- Statistical graphics
- Brushing and linking