Summary
In machine fault-location, medical diagnosis, species identification, and computer decision making, one is often required to identify some unknown object or condition, belonging to a known set of M possibilities, by applying a sequence of binary-valued tests selected from a given set of available tests. One would usually prefer a testing procedure that minimizes, or nearly minimizes, the expected testing cost of identification. Existing methods for determining a minimal expected cost testing procedure, however, require a number of operations that increases exponentially with M, and they become infeasible for problems of even moderate size. In practice, one therefore uses fast heuristic methods that aim to obtain low cost testing procedures but do not guarantee a minimal cost solution. Examining the important case in which all M possibilities are equally likely, we derive a number of cost-bounding results for the most common heuristic procedure, which always applies next the test yielding the maximum information gain per unit cost. In particular, we show that solutions obtained using this method can have expected cost greater than an arbitrary multiple of the optimal expected cost.
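To make the heuristic concrete, the following is a minimal Python sketch of a greedy splitting procedure for the equally likely case. It is an illustration under stated assumptions, not the authors' formulation: the function names `binary_entropy` and `greedy_expected_cost`, the representation of each test as a `(cost, positive_set)` pair, and the tie-breaking rule (the first test with the best score wins) are all hypothetical choices made here. Information gain is taken to be the binary entropy of the split that a test induces on the remaining candidates.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a yes/no outcome whose 'yes' probability is p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def greedy_expected_cost(objects, tests):
    """Expected testing cost of a greedy splitting procedure.

    objects: iterable of hashable labels, assumed equally likely.
    tests:   list of (cost, positive_set) pairs (a hypothetical encoding);
             a test answers 'yes' exactly for objects in positive_set.
    """
    universe = frozenset(objects)

    def summed_cost(candidates):
        # Sum, over all objects in `candidates`, of the cost still to be
        # paid before each one is uniquely identified.
        if len(candidates) <= 1:
            return 0.0
        best = None
        best_score = 0.0
        for cost, positive in tests:
            yes = candidates & positive
            gain = binary_entropy(len(yes) / len(candidates))
            score = gain / cost  # information gain per unit cost
            if score > best_score:  # strict: first best test wins ties
                best, best_score = (cost, yes), score
        if best is None:
            raise ValueError("no test separates the remaining objects")
        cost, yes = best
        no = candidates - yes
        # Every remaining candidate pays for the chosen test.
        return cost * len(candidates) + summed_cost(yes) + summed_cost(no)

    return summed_cost(universe) / len(universe)
```

For example, with four objects `0..3` and three unit-cost tests whose positive sets are `{0, 1}`, `{0, 2}`, and `{0}`, the sketch picks an even split at each step and returns an expected cost of 2.0. The abstract's main result concerns how far a greedy procedure of this kind can fall from the optimal expected cost; the sketch only computes the greedy cost and makes no attempt to find the optimum.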
Additional information
The authors are indebted to R. L. Rivest for simplifying the original proof of Theorem 1 and to P. J. Burke for his many valuable comments.
About this article
Cite this article
Garey, M.R., Graham, R.L. Performance bounds on the splitting algorithm for binary testing. Acta Informatica 3, 347–355 (1974). https://doi.org/10.1007/BF00263588
DOI: https://doi.org/10.1007/BF00263588