Abstract
In data envelopment analysis (DEA), some data preprocessing is usually necessary. For example, in many practical situations some of the input and/or output values are not available for all the decision-making units (DMUs), so a strategy for dealing with the missing data is needed. In this context, the present work proposes the application of a recent matrix approximation approach, known as low-rank matrix completion, for preprocessing missing data in DEA. The proposed method is evaluated through a number of numerical experiments carried out on both synthetic and real data. For a wide range of missing-data proportions, we compare the efficiencies of the DMUs obtained after recovering the missing entries with those obtained in an ideal situation in which all data are known. We also provide comparisons with other approaches that deal with missing data in the context of DEA. The results attest to the viability of applying the proposed low-rank matrix completion strategy to DEA.
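To make the idea of low-rank matrix completion concrete, the following is a minimal sketch and not the algorithm evaluated in the article. It assumes the DMU-by-variable data matrix is well approximated by a rank-1 matrix and fits that model to the observed entries by alternating least squares; the function name `complete_rank1`, the use of `None` to mark missing values, and the toy data are all illustrative assumptions.

```python
def complete_rank1(X, n_iters=200):
    """Impute missing entries (None) of X with a rank-1 fit u[i] * v[j].

    Illustrative assumption: the data are close to rank 1. The fit uses
    alternating least squares over the observed entries only.
    """
    n_rows, n_cols = len(X), len(X[0])
    u = [1.0] * n_rows  # one factor per DMU (row)
    v = [1.0] * n_cols  # one factor per input/output variable (column)
    for _ in range(n_iters):
        # Fix v and solve the 1-D least-squares problem for each row factor.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if X[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(X[i][j] * v[j] for j in obs) / den
        # Fix u and solve the 1-D least-squares problem for each column factor.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if X[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(X[i][j] * u[i] for i in obs) / den
    # Keep observed values untouched; fill only the missing ones.
    return [
        [X[i][j] if X[i][j] is not None else u[i] * v[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy data: 4 hypothetical DMUs, 3 variables, exactly rank 1, three values hidden.
X = [
    [2.0, 1.0, None],
    [None, 2.0, 6.0],
    [6.0, 3.0, 9.0],
    [8.0, None, 12.0],
]
filled = complete_rank1(X)
for row in filled:
    print([round(value, 3) for value in row])
```

The completed matrix would then be passed to an ordinary DEA model as if no data were missing. Real data are rarely exactly rank 1, so a practical implementation would use a higher rank or a nuclear-norm formulation.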
Additional information
L. T. Duarte would like to thank the São Paulo Research Foundation (FAPESP) (Process 2015/16325-1) and the National Council for Scientific and Technological Development (CNPq) (Process 311786/2014-6) for funding his research.
About this article
Cite this article
Duarte, L.T., Mussio, A.P. & Torezzan, C. Dealing with missing information in data envelopment analysis by means of low-rank matrix completion. Ann Oper Res 286, 719–732 (2020). https://doi.org/10.1007/s10479-018-2885-0
DOI: https://doi.org/10.1007/s10479-018-2885-0