
Quantile–DEA classifiers with interval data

Annals of Operations Research


Abstract

This research develops classifiers for binary classification problems with interval data, a class of problems widely recognized as difficult across fields. The proposed classifiers combine the ideas and techniques of quantiles and data envelopment analysis (DEA), and are therefore referred to as quantile–DEA classifiers. Specifically, the classifiers first use the concept of quantiles to generate a desired number of exact-data sets from a training-data set comprising interval data. They then use the intersection-form production possibility set of the DEA framework to construct acceptance domains, each corresponding to one exact-data set and thus to one quantile. Because an intersection-form acceptance domain is represented by a system of linear inequalities, the quantile–DEA classifiers can efficiently identify the groups to which large volumes of data belong. In addition, the quantile feature enables the proposed classifiers not only to help reveal patterns, but also to indicate to the user the value or significance of these patterns.


References

  • Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. European Journal of Operational Research, 2, 429–444.

  • Charnes, A., Cooper, W. W., Wei, Q. L., & Huang, Z. M. (1989). Cone ratio data envelopment analysis and multiobjective programming. International Journal of Systems Science, 20(7), 1099–1118.

  • Cooper, W. W., Park, K. S., & Yu, G. (1999). IDEA and AR-IDEA: Models for dealing with imprecise data in DEA. Management Science, 45, 597–607.

  • Cooper, W. W., Seiford, L. M., & Tone, K. (2006). Introduction to data envelopment analysis and its uses: With DEA-solver software and references. New York: Springer.

  • Corne, D., Dhaenens, C., & Jourdan, L. (2012). Synergies between operations research and data mining: The emerging use of multi-objective approaches. European Journal of Operational Research, 221, 469–479.

  • Despotis, D. K., & Smirlis, Y. G. (2002). Data envelopment analysis with imprecise data. European Journal of Operational Research, 140, 24–36.

  • Han, J., & Kamber, M. (2007). Data mining: Concepts and techniques. San Francisco: Morgan Kaufmann Publishers.

  • Kao, C. (2006). Interval efficiency measures in data envelopment analysis with imprecise data. European Journal of Operational Research, 174, 1087–1099.

  • Pendharkar, P. C. (2002). A potential use of DEA for inverse classification problem. Omega: An International Journal of Management Science, 30, 243–248.

  • Pendharkar, P. C. (2011). A hybrid radial basis function and data envelopment analysis neural network for classification. Computers and Operations Research, 38, 256–266.

  • Pendharkar, P. C. (2012). Fuzzy classification using the data envelopment analysis. Knowledge-Based Systems, 31, 183–192.

  • Pendharkar, P. C., Khosrowpour, M., & Rodger, J. A. (2000). Application of Bayesian network classifiers and data envelopment analysis for mining breast cancer patterns. The Journal of Computer Information Systems, 40(4), 127–132.

  • Pendharkar, P. C., & Troutt, M. D. (2011). DEA based dimensionality reduction for classification problems satisfying strict non-satiety assumption. European Journal of Operational Research, 212, 155–163.

  • Seifert, J. W. (2004). Data mining: An overview. CRS Report for Congress, The Library of Congress, Order Code RL31798. http://www.fas.org/irp/crs/RL31798.pdf.

  • Seiford, L. M., & Zhu, J. (1998). An acceptance system decision rule with data envelopment analysis. Computers and Operations Research, 25(4), 329–332.

  • Sinha, A. P., & Zhao, H. (2008). Incorporating domain knowledge into data mining classifiers: An application in indirect lending. Decision Support Systems, 46, 287–299.

  • Troutt, M. D., Rai, A., & Zhang, A. (1996). The potential use of DEA for credit applicant acceptance systems. Computers and Operations Research, 23(4), 405–408.

  • Wei, Q. L., & Yan, H. (2001). A method of transferring polyhedron between the intersection-form and the sum-form. Computers and Mathematics with Applications, 41, 1327–1342.

  • Wei, Q. L., & Yu, G. (1997). Analyzing the properties of K-cone in generalized data envelopment analysis model. Journal of Econometrics, 80, 63–84.

  • Yan, H., & Wei, Q. L. (2000). A method of transferring cones of intersection-form to cones of sum-form and its applications in DEA models. International Journal of Systems Science, 31(5), 629–638.

  • Yan, H., & Wei, Q. L. (2011). Data envelopment analysis classification machine. Information Sciences, 181, 5029–5041.

  • Ying, M. Q., Xu, R. E., & Wei, Q. L. (1975). Stability of mathematical programming. Acta Mathematica Sinica, 18(2), 123–175.

  • Yu, G., Wei, Q. L., & Brockett, P. (1996). A generalized data envelopment analysis model: A unification and extension of existing methods for efficiency analysis of decision making units. Annals of Operations Research, 66, 47–89.

  • Zhu, J. (2003). Imprecise data envelopment analysis: A review and improvement with an application. European Journal of Operational Research, 144, 513–529.

Acknowledgments

This paper has benefited from the suggestions offered by the reviewers, and this assistance is gratefully acknowledged. In addition, the first and the third authors are partially supported by the National Natural Science Foundation of China, NNSF 71271208.

Author information

Corresponding author

Correspondence to Tsung-Sheng Chang.

Appendices

Appendix 1: Proof of Theorem 1(i)

Theorem 1

Let \(L<\bar{\beta}<\hat{\beta},\) and

$$ T_{\bar{\beta}}=\left\{ x\left\vert \sum\limits_{j=1}^{n}x_{j}^{\bar{\beta}} \lambda_{j}\leq x,\,\sum\limits_{j=1}^{n}\lambda_{j}\geq1,\,\lambda_{j}\geq 0,\,j=1,\ldots,n\right. \right\}, $$

and

$$ T_{\hat{\beta}}=\left\{ x\left\vert \sum\limits_{j=1}^{n}x_{j}^{\hat{\beta}} \lambda_{j}\leq x,\,\sum\limits_{j=1}^{n}\lambda_{j}\geq1,\,\lambda_{j}\geq 0,\,j=1,\ldots,n\right. \right\}. $$

Then, \(T_{\hat{\beta}}\subset T_{\bar{\beta}}.\)

Proof

Since \(b_{j}>a_{j},\,j=1,\ldots,n,\) if \(L<\bar{\beta} <\hat{\beta},\) then

$$ x_{j}^{\bar{\beta}}=a_{j}+\bar{\beta}\left( b_{j}-a_{j}\right) <a_{j} +\hat{\beta}\left( b_{j}-a_{j}\right) =x_{j}^{\hat{\beta}},\quad j=1,\ldots,n. $$

It follows that if \(\sum\nolimits_{j=1}^{n}\lambda_{j}\geq1,\,\lambda_{j} \geq0,\,j=1,\ldots,n,\) then

$$ \sum\limits_{j=1}^{n}x_{j}^{\bar{\beta}}\lambda_{j}< \sum\limits_{j=1}^{n}x_{j}^{\hat{\beta}}\lambda_{j}. $$

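Thus, if \(x\in T_{\hat{\beta}}\) with multipliers \(\lambda_{1},\ldots,\lambda_{n},\) then the same multipliers give \(\sum\nolimits_{j=1}^{n}x_{j}^{\bar{\beta}}\lambda_{j}<\sum\nolimits_{j=1}^{n}x_{j}^{\hat{\beta}}\lambda_{j}\leq x,\) so \(x\in T_{\bar{\beta}};\) that is, \(T_{\hat{\beta}}\subset T_{\bar{\beta}}.\) □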
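As a small numerical illustration of Theorem 1 (not taken from the paper), consider a single input (\(m=1\)) and two DMUs with hypothetical intervals \([a_{1},b_{1}]=[2,4]\) and \([a_{2},b_{2}]=[3,9],\) and take \(\bar{\beta}=0.25,\,\hat{\beta}=0.75,\) assuming \(L<0.25.\) With one positive input, the constraints \(\sum\nolimits_{j}\lambda_{j}\geq1,\,\lambda_{j}\geq0\) give \(\sum\nolimits_{j}x_{j}^{\beta}\lambda_{j}\geq\min_{j}x_{j}^{\beta},\) so \(T_{\beta}\) reduces to a half-line:

$$ \begin{aligned} x_{1}^{0.25}&=2+0.25\times2=2.5, & x_{2}^{0.25}&=3+0.25\times6=4.5, & T_{0.25}&=\left\{ x\,|\,x\geq2.5\right\},\\ x_{1}^{0.75}&=2+0.75\times2=3.5, & x_{2}^{0.75}&=3+0.75\times6=7.5, & T_{0.75}&=\left\{ x\,|\,x\geq3.5\right\}. \end{aligned} $$

Hence \(T_{0.75}\subset T_{0.25},\) in agreement with the theorem: raising the quantile shrinks the acceptance domain.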

Appendix 2: Proof of Theorem 2

To prove Theorem 2, we first present the following two lemmas:

Lemma 1

If \(L<\bar{\beta}<\hat{\beta},\) and \(\tilde{x}\in T_{\hat{\beta}}=\left\{ x\left\vert \sum\nolimits_{j=1}^{n}x_{j}^{\hat{\beta}} \lambda_{j}\leq x,\,\sum\nolimits_{j=1}^{n}\lambda_{j}\geq1,\,\lambda_{j}\geq 0,\,j=1,\ldots,n\right. \right\}, \) then the optimal objective function value of the following linear program is less than one; i.e., \(\hat{\theta }( \bar{\beta}) <1.\)

$$ \begin{array}{ll} & \hat{\theta}( \bar{\beta}) =\min\theta,\\ \left( P_{\bar{\beta}}\right) & \hbox{s.t.}\; \sum\limits_{j=1} ^{n}x_{j}^{\bar{\beta}}\lambda_{j}\leq\theta\tilde{x},\\ & \sum\limits_{j=1}^{n}\lambda_{j}\geq1,\\ & \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

Proof

Let \(T_{\bar{\beta}}=\left\{ x\left\vert \sum\nolimits _{j=1}^{n}x_{j}^{\bar{\beta}}\lambda_{j}\leq x,\,\sum\nolimits_{j=1}^{n}\lambda_{j} \geq1,\,\lambda_{j}\geq0,\,j=1,\ldots,n\right. \right\}. \) Since \(\tilde {x}\in T_{\hat{\beta}},\) there exist \(\tilde{\lambda}_{1},\,\tilde{\lambda} _{2},\ldots,\tilde{\lambda}_{n}\) that satisfy

$$ \begin{aligned} &{\sum\limits_{j=1}^{n}x_{j}^{\hat{\beta}}{\tilde{\lambda}_{j}}\leq\tilde{x}},\\ &{{\sum\limits_{j=1}^{n}{\tilde{\lambda}_{j}}{\geq}1}},\\ &{{\tilde{\lambda}_{j}}{\geq} 0,\quad j=1,\ldots,n}. \end{aligned} $$

Furthermore, since \(\sum\nolimits_{j=1}^{n}\tilde{\lambda}_{j}\geq1,\) we have \(( \tilde{\lambda}_{1},\,\tilde{\lambda}_{2},\ldots,\tilde{\lambda} _{n}) \neq0.\) Moreover, since \(a_{j}<b_{j},\) we have \(0<x_{j}^{\bar{\beta}}<x_{j}^{\hat{\beta}},\,j=1,\ldots,n.\) Therefore,

$$ \sum\limits_{j=1}^{n}x_{j}^{\bar{\beta}}\tilde{\lambda}_{j}<\sum\limits_{j=1}^{n}x_{j} ^{\hat{\beta}}\tilde{\lambda}_{j}\leq\tilde{x}. $$

It follows that there exist solutions to the following system of inequalities:

$$ \begin{aligned} &{\sum\limits_{j=1}^{n}x_{j}^{\bar{\beta}}\lambda_{j}<\tilde{x}},\\ &{\sum\limits_{j=1}^{n}\lambda_{j}\geq1},\\ &{\lambda_{j}\geq0,\quad j=1,\ldots,n}. \end{aligned} $$

As a result, some \(\theta<1\) is feasible for \(( P_{\bar{\beta}}),\) and hence \(\hat{\theta}( \bar{\beta}) <1\) (i.e., the optimal objective function value of \(( P_{\bar{\beta}})\) is less than one). □

Lemma 2

If \(L<\beta,\, \hat{x}>0\) and \(\hat{x}\notin T_{\beta},\) then the optimal objective function value of the following linear program is greater than one; i.e., \(\hat{\theta}( \beta) >1.\)

$$ \begin{array}{lll} && \hat{\theta}( \beta) =\min\theta,\\ \left( P_{\beta}\right) &\hbox{s.t.} &\sum\limits_{j=1}^{n} x_{j}^{\beta}\lambda_{j}\leq\theta\hat{x},\\ && \sum\limits_{j=1}^{n}\lambda_{j}\geq 1,\\ && \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

Proof

Let \(\hat{\theta},\,\hat{\lambda}_{1},\,\hat{\lambda}_{2} ,\ldots,\hat{\lambda}_{n}\) denote an optimal solution to \(( P_{\beta }), \) so that \(\hat{\theta}( \beta) =\hat{\theta}.\) Suppose, to the contrary, that \(\hat{\theta}\leq1.\) Then

$$ \begin{aligned} &{\sum\limits_{j=1}^{n}x_{j}^{\beta}\hat{\lambda}_{j}\leq\hat{\theta}\hat{x}\leq \hat{x}},\\ &{\sum\limits_{j=1}^{n}\hat{\lambda}_{j}\geq1},\\ &{\hat{\lambda}_{j}\geq0,\quad j=1,\ldots,n}. \end{aligned} $$

That is, \(\hat{x}\in T_{\beta},\) which is a contradiction. □

In what follows, we prove Theorem 2, first part (i) and then part (ii).

Theorem 2

Let \(\hat{x}\in\hat{T}\cap {\rm Int}\,\left\{ x|\sum\nolimits_{j=1}^{n}x_{j}^{L}\lambda_{j}\leq x,\,\sum\nolimits_{j=1}^{n}\lambda_{j} \geq1,\,\lambda_{j}\geq0,\,j=1,\ldots,n\right\},\) and \(\hat{\theta}( \beta) \) be the quantile function of DMU-\(\hat{x}.\) Then,

  1. (i)

    \(\hat{\theta}( \beta) \) is a continuous function defined over (L, +∞).

  2. (ii)

    \(\hat{\theta}( \beta) \) is a strictly monotonically increasing function over (L, +∞).

Proof

  1. (i)

    Consider the following linear program \(( P_{\beta}):\)

    $$ \begin{array}{lll} && \hat{\theta}( \beta) =\min\theta,\\ \left( P_{\beta}\right) & \hbox{s.t.}& \sum\limits_{j=1}^{n} x_{j}^{\beta}\lambda_{j}\leq\theta\hat{x},\\ && \sum\limits_{j=1}^{n}\lambda_{j}\geq1,\\ && \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

    Equivalently,

    $$ \begin{array}{lll} && \hat{\theta}( \beta) =\min\theta,\\ \left( P_{\beta}\right) & \hbox{s.t.}& \sum\limits_{j=1}^{n}\left[ a_{j}+\beta\left( b_{j}-a_{j}\right) \right] \lambda_{j}\leq\theta\hat{x},\\ && \sum\limits_{j=1}^{n}\lambda_{j}\geq1,\\ && \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

    According to the stability of linear programming (Ying et al. 1975), the optimal objective function value of \(( P_{\beta}),\) \( \hat{\theta}(\beta), \) is a continuous function defined over (L, +∞).

  2. (ii)

    Let \(L<\bar{\beta}<\hat{\beta},\) and consider the following problem \(( P_{\hat{\beta}}):\)

    $$ \begin{array}{lll} && \hat{\theta}( \hat{\beta}) =\min\theta,\\ \left( P_{\hat{\beta}}\right) & \hbox{s.t.}& \sum\limits_{j=1} ^{n}x_{j}^{\hat{\beta}}\lambda_{j}\leq\theta\hat{x},\\ && \sum\limits_{j=1}^{n}\lambda_{j}\geq1,\\ && \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

    It is clear that \(\hat{\theta}( \hat{\beta}) \hat{x}\in T_{\hat{\beta}}.\) Consider also the following problem \(( \tilde{P}_{\bar{\beta}}):\)

    $$ \begin{array}{lll} && \tilde{\theta} =\min\theta,\\ \left( \tilde{P}_{\bar{\beta}}\right) &\hbox{s.t.}& \sum\limits_{j=1}^{n}x_{j}^{\bar{\beta}}\lambda_{j}\leq\theta( \hat{\theta}( \hat{\beta}) \hat{x}), \\ && \sum\limits_{j=1}^{n}\lambda_{j}\geq1,\\ && \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

    Let \(\tilde{\theta},\,\tilde{\lambda}_{1},\,\tilde{\lambda} _{2},\ldots,\tilde{\lambda}_{n}\) denote an optimal solution to \(( \tilde{P}_{\bar{\beta}}) .\) It is easy to check that \(\tilde{\theta} >0.\) Furthermore, since \(\hat{\theta}( \hat{\beta}) \hat{x}\in T_{\hat{\beta}}\) and \(L<\bar{\beta}<\hat{\beta},\) Lemma 1 gives \(\tilde {\theta}<1.\) Moreover, \(( \tilde{\theta}\hat{\theta}( \hat{\beta}),\,\tilde{\lambda}_{1},\ldots,\tilde{\lambda}_{n}) \) is feasible for \(( P_{\bar{\beta}}) ,\) whose optimal objective function value is \(\hat{\theta}( \bar{\beta}) ;\) hence \(\hat{\theta}( \bar{\beta}) \leq\tilde{\theta}\hat{\theta }( \hat{\beta}) <\hat{\theta}( \hat{\beta}). \) □
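The monotonicity argument can be checked numerically. The following is a minimal Python sketch, not the authors' implementation, that evaluates \(\hat{\theta}(\beta)\) for the special case of two inputs (\(m=2\)) and strictly positive data. It relies on the assumption that, in this case, an optimal basic solution of \((P_{\beta})\) has at most two positive \(\lambda_{j}\) summing to one, so single DMUs and pairs of DMUs can be enumerated instead of calling an LP solver. The function names, the data `A`, `B`, `X_HAT`, and the grid of β values (which presumes \(L<0\)) are all hypothetical.

```python
from itertools import combinations


def quantile_points(a, b, beta):
    """Exact data x_j^beta = a_j + beta * (b_j - a_j), componentwise for each DMU j."""
    return [tuple(al + beta * (bl - al) for al, bl in zip(aj, bj))
            for aj, bj in zip(a, b)]


def theta_hat(a, b, beta, x_hat):
    """Optimal value of (P_beta) for two inputs and positive data (assumed shortcut).

    With sum(lambda) = 1, theta equals the larger of the two input ratios,
    so we minimise that maximum over single DMUs and over pairs of DMUs.
    """
    pts = quantile_points(a, b, beta)
    # Ratios (x_1j^beta / x_hat_1, x_2j^beta / x_hat_2) for each DMU j.
    r = [(p[0] / x_hat[0], p[1] / x_hat[1]) for p in pts]
    best = min(max(u, v) for u, v in r)  # candidates with one lambda_j = 1
    for (u1, v1), (u2, v2) in combinations(r, 2):
        # On lambda = (t, 1 - t) both ratios are linear in t; the only
        # interior candidate is the point where the two ratios are equal.
        denom = (u1 - u2) - (v1 - v2)
        if denom != 0:
            t = (v2 - u2) / denom
            if 0 < t < 1:
                best = min(best, u1 * t + u2 * (1 - t))
    return best


# Hypothetical interval data: lower bounds A, upper bounds B (b_j > a_j), and a unit X_HAT.
A = [(2, 6), (5, 2)]
B = [(4, 8), (8, 3)]
X_HAT = (5, 5)

betas = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [theta_hat(A, B, beta, X_HAT) for beta in betas]
for beta, value in zip(betas, values):
    print(f"beta = {beta:.2f}  theta_hat = {value:.4f}")
```

On this data the computed values grow with β and cross one between the smallest and largest grid points, which is the behaviour Theorem 2(ii) and its proof describe. For more than two inputs the pair enumeration no longer suffices and a general LP solver would be needed.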

Appendix 3: Existence of β*

The following Theorem 3 shows the existence of β*.

Theorem 3

Let \(\hat{x}\in\hat{T}\cap {\rm Int}\,\left\{ x|\sum\nolimits_{j=1}^{n}x_{j}^{L}\lambda_{j}\leq x,\,\sum\nolimits_{j=1}^{n}\lambda_{j} \geq1,\,\lambda_{j}\geq0,\,j=1,\ldots,n\right\},\) and \(\hat{\theta}( \beta)\) be the quantile function of DMU-\(\hat{x}.\) Then, there exists β* ∈ (L, +∞) such that the optimal objective function value of the following problem \(( P_{\beta})\) is equal to one; i.e., \(\hat{\theta}(\beta^{\ast})=1.\)

$$ \begin{array}{lll} && \hat{\theta}( \beta) =\min\theta,\\ \left( P_{\beta}\right) & \hbox{s.t.}& \sum\limits_{j=1}^{n}x_{j}^{\beta}\lambda_{j}\leq\theta\hat{x},\\ && \sum\limits_{j=1}^{n}\lambda_{j}\geq1,\\ && \lambda_{j}\geq0,\quad j=1,\ldots,n. \end{array} $$

Proof

  1. (i)

    If \(\hat{x}\) is located on the frontier of \(T_{1},\) then \(\hat{\theta}( 1) =1,\) i.e., β* = 1.

  2. (ii)

    If \(\hat{x}\) is not located on the frontier of \(T_{1},\) and \(\hat{x}\in{\rm Int}\,T_{1},\) then there exist \(\lambda_{j}^{0}\geq 0,\,j=1,\ldots,n,\,\sum\nolimits_{j=1}^{n}\lambda_{j}^{0}\geq1\) such that

    $$ \sum\limits_{j=1}^{n}\left[ a_{j}+1\times\left( b_{j}-a_{j}\right) \right] \lambda_{j}^{0}=\sum\limits_{j=1}^{n}b_{j}\lambda_{j}^{0}<\hat{x}, \tag{1} $$

    and \(\hat{\theta}( 1) <1.\) Let

    $$ \hat{\beta}>\max\left\{ \underset{1\leq i\leq m;1\leq j\leq n}{\max}\left\{ \left( \hat{x}_{i}-a_{ij}\right) /\left( b_{ij}-a_{ij}\right) \right\} ,\,L\right\}. $$

    Then,

    $$ a_{j}+\hat{\beta}\left( b_{j}-a_{j}\right) >\hat{x},\quad j=1,\ldots,n. $$

    Therefore, for any \(\lambda_{j}\geq0,\,j=1,\ldots,n,\,\sum\nolimits_{j=1}^{n} \lambda_{j}\geq1,\) we have

    $$ \sum\limits_{j=1}^{n}x_{j}^{\hat{\beta}}\lambda_{j}>\hat{x}. \tag{2} $$

    From (2), \(\hat{x}\notin T_{\hat{\beta}},\) and from Lemma 2, \(\hat{\theta}( \hat{\beta}) >1.\) By Theorem 2(i), \(\hat{\theta}( \beta) \) is a continuous function defined over (L, +∞). Since \(\hat{\theta }( 1) <1\) and \(\hat{\theta}( \hat{\beta}) >1\) with \(\hat{\beta}\in(L,\,+\infty), \) the intermediate value theorem implies that there exists \(\beta^{\ast}\in(L,\,+\infty) \) such that \(\hat{\theta}( \beta^{\ast}) =1.\)

  3. (iii)

    If \(\hat{x}\notin T_{1},\) from Lemma 2, \(\hat{\theta}( 1) >1.\) In addition, since

    $$ \hat{x}\in\text{Int}\,\left\{x |\sum\limits_{j=1}^{n}x_{j}^{L}\lambda_{j}\leq x,\,\sum\limits_{j=1}^{n}\lambda_{j}\geq1,\,\lambda_{j}\geq0,\,j=1,\ldots,n\right\}, $$

    there exist \(\lambda_{j}^{0}\geq0,\,j=1,\ldots,n,\,\sum\nolimits_{j=1}^{n} \lambda_{j}^{0}\geq1\) such that

    $$ \sum\limits_{j=1}^{n}\left[a_{j}+L\times\left( b_{j}-a_{j}\right) \right]\lambda_{j}^{0}=\sum\limits_{j=1}^{n}x_{j}^{L}\lambda_{j}^{0}<\hat{x}. $$

    Therefore, there exists \(\hat{\beta}\) that satisfies \(\hat{\beta}>L\) such that

    $$ \sum\limits_{j=1}^{n}x_{j}^{\hat{\beta}}\lambda_{j}^{0}<\hat{x}. $$

    That is, \(\hat{x}\in {\rm Int}\,T_{\hat{\beta}},\) and thus \(\hat{\theta}( \hat{\beta}) <1.\) By Theorem 2(i), \(\hat{\theta}( \beta) \) is a continuous function defined over (L, +∞). Since \(\hat{\theta}( 1) >1\) and \(\hat{\theta}( \hat{\beta}) <1\) with \(\hat{\beta}\in(L,\,+\infty), \) the intermediate value theorem implies that there exists β* ∈ (L, +∞) such that \(\hat{\theta }( \beta^{\ast}) =1.\) □
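Because the existence argument rests only on continuity and a sign change of \(\hat{\theta}(\beta)-1,\) β* can be located numerically by bisection once a bracket is known. The sketch below is an illustration under stated assumptions rather than the paper's procedure: `find_quantile` and `theta_single` are hypothetical names, the data are invented, and the demonstration uses a single positive input (\(m=1\)), for which \(\hat{\theta}(\beta)=\min_{j}x_{j}^{\beta}/\hat{x}.\) Both bracket endpoints are assumed to lie in (L, +∞).

```python
def find_quantile(theta, lo, hi, tol=1e-10, max_iter=200):
    """Bisection for beta* with theta(beta*) = 1.

    Assumes theta is continuous and strictly increasing on [lo, hi]
    and that theta(lo) < 1 < theta(hi).
    """
    if not (theta(lo) < 1.0 < theta(hi)):
        raise ValueError("bracket must satisfy theta(lo) < 1 < theta(hi)")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if theta(mid) < 1.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical single-input data: intervals [a_j, b_j] and a unit x_hat.
a = [2.0, 3.0]
b = [4.0, 9.0]
x_hat = 3.0


def theta_single(beta):
    """Single-input quantile function: min_j (a_j + beta * (b_j - a_j)) / x_hat."""
    return min(aj + beta * (bj - aj) for aj, bj in zip(a, b)) / x_hat


beta_star = find_quantile(theta_single, 0.0, 1.0)
print(f"beta* is approximately {beta_star:.6f}")
```

Here \(\hat{\theta}(0)=2/3<1\) and \(\hat{\theta}(1)=4/3>1,\) which corresponds to case (iii) of the proof, and solving \((2+2\beta)/3=1\) by hand gives β* = 0.5 for comparison with the numerical result.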

Appendix 4: Uniqueness of β*

The following Theorem 4 shows the uniqueness of β*.

Theorem 4

Let \(b_{j}>a_{j},\,j=1,\ldots,n,\, L<\bar{\beta} <\hat{\beta},\) and \(\hat{x}\in\hat{T}.\) Then

  1. (i)

    There is no intersection between the frontiers of \(T_{\hat{\beta }}\) and \(T_{\bar{\beta}}.\)

  2. (ii)

    The quantile of DMU-\(\hat{x},\) i.e., β*, is uniquely determined.

Proof

Part (i) is proved by contradiction. Suppose that there exists \(x^{0}\in\Re_{+}^{m}\) located on the frontiers of both \(T_{\hat{\beta}}\) and \(T_{\bar{\beta}}.\) Then, from Theorem 2, \(1=\hat{\theta }( \bar{\beta}) <\hat{\theta}( \hat{\beta}) =1,\) which is a contradiction. That is, there is no intersection between the frontiers of \(T_{\hat{\beta}}\) and \(T_{\bar{\beta}}.\)

Part (ii) is also proved by contradiction. Assume that there exist two distinct quantiles of DMU-\(\hat{x},\) i.e., \(\beta_{1}^{\ast}\) and \(\beta_{2}^{\ast}.\) Without loss of generality, assume that \(L<\beta_{1} ^{\ast}<\beta_{2}^{\ast}.\) Since both \(\beta_{1}^{\ast}\) and \(\beta_{2}^{\ast }\) are quantiles of DMU-\(\hat{x},\) we have \(\hat{\theta}( \beta_{1}^{\ast }) =\hat{\theta}( \beta_{2}^{\ast}) =1.\) However, from Theorem 2, \(\hat{\theta}( \beta_{1}^{\ast}) <\hat{\theta}( \beta_{2}^{\ast}), \) which is a contradiction. It follows that β* is uniquely determined. □

About this article

Cite this article

Wei, Q., Chang, TS. & Han, S. Quantile–DEA classifiers with interval data. Ann Oper Res 217, 535–563 (2014). https://doi.org/10.1007/s10479-014-1565-y

  • DOI: https://doi.org/10.1007/s10479-014-1565-y
