Abstract
In Brazil, power generation, transmission and distribution companies operating in the regulated market are paid for the availability of their equipment. When a system is unavailable, the company is financially penalized, more severely for unplanned interruptions. This work presents a domain-driven data mining approach for estimating the risk of system unavailability from the historical data of its component equipment, within one of the largest companies in the Brazilian electric sector. Traditional statistical estimators are combined with the concepts of Recency, Frequency and Impact (RFI) to produce variables carrying behavioral information finely tuned to the application domain. The unavailability costs are embedded in the problem modeling strategy. Logistic regression models bagged via their median score achieved Max_KS=0.341 and AUC_ROC=0.699 on the out-of-time data sample, a performance much higher than that of the previous approaches attempted within the company. The system has been put into operation and will be monitored for performance reassessment and maintenance replanning.
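As an illustration of the evaluation described in the abstract, the following minimal sketch shows how scores from several models could be combined by their median and then assessed with the maximum Kolmogorov-Smirnov distance (Max_KS) and the area under the ROC curve (AUC_ROC). It is not the authors' implementation: the function names, the three score lists and the labels are hypothetical, and the scores stand in for the outputs of already fitted logistic regression models.

```python
from statistics import median


def median_bag(score_lists):
    """Combine per-model scores into one score per example via the median."""
    return [median(scores) for scores in zip(*score_lists)]


def max_ks(scores, labels):
    """Largest gap between the cumulative score distributions of the two classes."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels))
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all examples tied at this score before comparing the CDFs.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            cum_pos += pairs[j][1]
            cum_neg += 1 - pairs[j][1]
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def auc_roc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = system became unavailable, 0 = it did not.
labels = [0, 0, 1, 0, 1, 1]
# Hypothetical scores from three models trained on different bootstrap samples.
model_scores = [
    [0.10, 0.40, 0.35, 0.80, 0.70, 0.90],
    [0.20, 0.30, 0.50, 0.60, 0.65, 0.85],
    [0.15, 0.50, 0.45, 0.40, 0.90, 0.80],
]

bagged = median_bag(model_scores)
ks_value = max_ks(bagged, labels)
auc_value = auc_roc(bagged, labels)
print(f"Max_KS={ks_value:.3f} AUC_ROC={auc_value:.3f}")
```

Taking the median rather than the mean makes the combined score less sensitive to a single model that produces an extreme value for a given example. The pairwise AUC computation is quadratic in the sample size, which is acceptable for a sketch but not for large data sets.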
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Adeodato, P.J.L. et al. (2010). Domain Driven Data Mining for Unavailability Estimation of Electrical Power Grids. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13025-0_38
DOI: https://doi.org/10.1007/978-3-642-13025-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13024-3
Online ISBN: 978-3-642-13025-0
eBook Packages: Computer Science (R0)