Estimation of distributions involving unobservable events: the case of optimal search with unknown Target Distributions

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

We consider the problem of estimating the parameters of a distribution when the underlying events are themselves unobservable. The aim is to perform a task (for example, searching a web-site or querying a distributed database) based on a distribution involving the state of nature, even though we are not allowed to observe the various “states of nature” involved. In particular, we concentrate on the task of searching for an object in a set of $N$ locations (or bins) $\{C_1, C_2, \ldots, C_N\}$, in which the probability of the object being in location $C_i$ is $p_i$, where $P = [p_1, p_2, \ldots, p_N]^T$ is called the Target Distribution. The probability of locating the object in a bin within a specified time, given that it is in that bin, is given by a function called the Detection function, which, in its most common instantiation, is an exponential function. The intention is to allocate the available resources so as to maximize the probability of locating the object. The handicap, however, is that the time allowed is limited, and thus the fact that the object is not located in bin $C_i$ within a specified time does not necessarily imply that the object is not in $C_i$. This problem has applications in searching large databases, distributed databases, and the world-wide web, where the locations of the files sought are unknown, and in developing various military and strategic policies. All prior research in this area has assumed that the $\{p_i\}$ are known. In this paper we consider the problem of obtaining error bounds, estimating the Target Distribution, and allocating the search times when the $\{p_i\}$ are unknown. To the best of our knowledge, these are the first results available in this area; they are particularly interesting because, as mentioned earlier, the events concerning the Target Distribution are themselves unobservable.
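For orientation, the classical setting in which the Target Distribution is known can be sketched in a few lines of Python. This is not the method of the paper, which addresses the case of unknown $p_i$. The sketch assumes a Detection function of the form $1 - e^{-k_i t_i}$ with hypothetical per-bin rates $k_i$ and a total time budget $T$; the function names are illustrative only. It allocates the budget by the standard Lagrangian argument and also shows the Bayes update implied by the remark that an unsuccessful search of $C_i$ does not rule out $C_i$.

```python
import math

def detection(rate, t):
    """Assumed exponential Detection function: P(found within t | object is in the bin)."""
    return 1.0 - math.exp(-rate * t)

def allocate(p, rates, total_time, iters=200):
    """Split total_time over bins to maximise sum_i p[i] * detection(rates[i], t[i]).

    The KKT conditions give t[i] = max(0, ln(p[i] * rates[i] / lam) / rates[i]);
    the multiplier lam is found by bisection so that the times sum to total_time.
    """
    def times(lam):
        return [max(0.0, math.log(pi * ki / lam) / ki) if pi > 0 else 0.0
                for pi, ki in zip(p, rates)]

    hi = max(pi * ki for pi, ki in zip(p, rates))   # every time is 0 at lam = hi
    lo = hi * math.exp(-max(rates) * total_time)    # times sum to >= total_time here
    for _ in range(iters):
        mid = math.sqrt(lo * hi)                    # bisect on a log scale
        if sum(times(mid)) > total_time:
            lo = mid
        else:
            hi = mid
    return times(math.sqrt(lo * hi))

def posterior_after_failure(p, i, d):
    """Bayes update of the Target Distribution after bin i was searched without success.

    d is the detection probability achieved in bin i; a failure lowers p[i]
    but does not drive it to zero unless d == 1.
    """
    miss = 1.0 - p[i] * d                           # P(the search of bin i fails)
    return [(pj * (1.0 - d) if j == i else pj) / miss for j, pj in enumerate(p)]

# Illustrative values only (not taken from the paper).
p = [0.5, 0.3, 0.2]
t = allocate(p, [1.0, 1.0, 1.0], total_time=3.0)
q = posterior_after_failure(p, 0, detection(1.0, t[0]))
```

With equal rates the more probable bins receive more time, and with a very small budget only the most probable bin is searched at all. When the $p_i$ are unknown, none of these quantities can be computed directly, which is the difficulty the paper sets out to address.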

Notes

  1. It is possible to get very good estimates of $\theta$ if one is provided with random occurrences of a known function of $X$. Thus, if instead of receiving $\{X_i\}$ we are provided with $\{Y_i\}$, where, for example, each $Y_i = X_i^2$, an MLE can easily be devised to estimate $\theta$ by observing the $Y_i$’s.

  2. Even though the review is extensive, we have tried to keep it crisp and “to the point”.

  3. Typically, the value of $p_0$ should satisfy $0.5 \le p_0 \le 1$.

  4. We are not aware of anyone who has any real-life data on this. While this is generally a long-term goal for learning from the web, we believe that, for the most part, this is an unsolved problem.

  5. These results are based on the joint work of the second author and his former student, Mr. Amr Ellaithy. The details of these results and other simulations (which are still being compiled) are to be included in a forthcoming paper co-authored by Mr. Ellaithy and the second author.

  6. In the simulation, this essentially meant assigning the binary value of that slot to unity.

  7. For example, consider the scenario of searching for a missing climber on a mountain. Suppose the climber was known to be at point $x_0$ at time 0. Then, when the search begins at time $t$, the target is distributed over an area that is centered at $x_0$ and has radius $vt$, where $v$ is the maximum speed at which the climber can move. Because of the life-threatening risk in question, and because there is no time to repeat the search process to estimate and compare the detection probability, intelligent search techniques akin to what we have explained can be crucial to the search endeavour.
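Note 1 can be made concrete with a small Python sketch. The distribution is an assumption chosen purely for illustration: $X$ is taken to be exponential with rate $\theta$, so that $X \ge 0$ and $X = \sqrt{Y}$ is recovered exactly from $Y = X^2$. Because the map is one-to-one on the support and its Jacobian does not involve $\theta$, the MLE based on the $Y_i$ is $n / \sum_i \sqrt{Y_i}$. The function name is hypothetical.

```python
import math
import random

def mle_rate_from_squares(ys):
    """MLE of the rate theta of X ~ Exponential(theta) when only Y = X**2 is observed."""
    # X is non-negative, so each X is recovered as sqrt(Y); the usual
    # exponential-rate MLE n / sum(X) then applies unchanged.
    roots = [math.sqrt(y) for y in ys]
    return len(roots) / sum(roots)

# Simulated check with an illustrative true rate (not data from the paper).
random.seed(0)
theta = 2.0
ys = [random.expovariate(theta) ** 2 for _ in range(20000)]
estimate = mle_rate_from_squares(ys)
```

The contrast with the search problem is that there even such a transformed observation of the underlying event is unavailable.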

Author information

Corresponding author

Correspondence to B. John Oommen.

Additional information

B. J. Oommen is a Fellow of the IEEE and the IAPR. The work of B. J. Oommen was partially supported by the Natural Sciences and Engineering Research Council of Canada.

About this article

Cite this article

Zhu, Q., Oommen, B.J. Estimation of distributions involving unobservable events: the case of optimal search with unknown Target Distributions. Pattern Anal Applic 12, 37–53 (2009). https://doi.org/10.1007/s10044-007-0095-5

  • DOI: https://doi.org/10.1007/s10044-007-0095-5
