Extension of multivariate regression trees to interval data. Application to electricity load profiling

Computational Statistics

Summary

Many data sets can be represented as interval curves, in which each interval reflects internal (within) variability. This representation is well suited to load profiles, which depict the electricity consumption of a class of customers. Electricity load profiling consists of assigning a daily load curve to a customer on the basis of characteristics such as energy requirement. Within this scope, the paper investigates the extension of multivariate regression trees to interval dependent (or response) variables. The tree method aims to build the load profiles and their assignment rules simultaneously from the independent variables. The extension of multivariate regression trees to interval responses is detailed and a global approach is defined. Its first stage is a dimension reduction of the interval response variables; the extended tree method is then applied to the first principal interval components. The outputs are classes of interval curves, each characterized by an interval load profile (the class prototype) and an assignment rule based on the independent variables.
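
The idea of splitting interval curves on an independent variable can be illustrated with a minimal sketch. It is not the criterion used in the paper, which is not given in this summary. It assumes that each interval is stored as a (lower, upper) pair, that the class prototype is the mean of the lower and of the upper bounds, and that node impurity is the sum of squared deviations of both bounds from that prototype. It also skips the dimension reduction stage. The names `prototype`, `impurity` and `best_split`, and the toy data, are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]   # (lower, upper) bound, an assumed encoding
Curve = List[Interval]           # one interval per time step


def prototype(curves: List[Curve]) -> Curve:
    """Mean interval curve: lower and upper bounds are averaged separately."""
    n = len(curves)
    steps = len(curves[0])
    return [
        (sum(c[t][0] for c in curves) / n, sum(c[t][1] for c in curves) / n)
        for t in range(steps)
    ]


def impurity(curves: List[Curve]) -> float:
    """Sum of squared deviations of both bounds from the node prototype."""
    if not curves:
        return 0.0
    proto = prototype(curves)
    return sum(
        (c[t][0] - proto[t][0]) ** 2 + (c[t][1] - proto[t][1]) ** 2
        for c in curves
        for t in range(len(proto))
    )


def best_split(x: List[float], curves: List[Curve]) -> Tuple[float, float]:
    """Best rule 'x <= threshold' on one variable: (threshold, total impurity)."""
    best = (float("nan"), float("inf"))
    values = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive observed values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [c for xi, c in zip(x, curves) if xi <= threshold]
        right = [c for xi, c in zip(x, curves) if xi > threshold]
        total = impurity(left) + impurity(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Toy data: four customers, one independent variable, two time steps each.
energy = [1.0, 2.0, 8.0, 9.0]
curves = [
    [(0.0, 1.0), (1.0, 2.0)],
    [(0.0, 1.0), (1.0, 2.0)],
    [(5.0, 7.0), (6.0, 9.0)],
    [(5.0, 7.0), (6.0, 9.0)],
]
threshold, remaining = best_split(energy, curves)
```

Applying `best_split` recursively to the two resulting groups would grow a tree whose leaves each carry an interval prototype and an assignment rule. A real implementation would also need a stopping or pruning rule, which this sketch leaves out.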



Cite this article

Cariou, V. Extension of multivariate regression trees to interval data. Application to electricity load profiling. Computational Statistics 21, 325–341 (2006). https://doi.org/10.1007/s00180-006-0266-7
