Abstract
We propose a new approach to the verification of probabilistic processes for which a model may not be available. We use a technique from Reinforcement Learning to approximate how far apart two processes are by solving a Markov Decision Process. If the two processes are equivalent, the algorithm returns zero; otherwise it provides a number and a test that together witness the non-equivalence. We suggest a new family of equivalences, called K-moment, for which this is possible. The weakest, 1-moment equivalence, is trace equivalence. The others are weaker than bisimulation but stronger than trace equivalence.
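To make the phrase "a number and a test that witness the non-equivalence" concrete for the weakest case (trace equivalence), the following is a minimal sketch, not the reinforcement-learning algorithm of the paper. It assumes that both models are fully known, that trace equivalence means every trace is accepted with the same probability in both processes, and that a brute-force search over short traces is acceptable. The dictionary format and the names `trace_prob`, `max_trace_gap`, `P`, `Q` and `R` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical model format (an assumption, not taken from the paper):
#   proc[state][action] = list of (probability, next_state)
# The probabilities of an action may sum to less than 1; the remainder
# is the probability that the action is refused in that state.

def trace_prob(proc, state, trace):
    """Probability that every action of `trace` is accepted, starting in `state`."""
    if not trace:
        return 1.0
    head, rest = trace[0], trace[1:]
    return sum(p * trace_prob(proc, nxt, rest)
               for p, nxt in proc.get(state, {}).get(head, []))

def max_trace_gap(proc_a, start_a, proc_b, start_b, actions, depth):
    """Largest gap in acceptance probability over all traces of length <= depth.

    Returns (gap, witness_trace); (0.0, None) means no trace up to `depth`
    tells the two processes apart.
    """
    best_gap, best_trace = 0.0, None
    for n in range(1, depth + 1):
        for trace in product(actions, repeat=n):
            gap = abs(trace_prob(proc_a, start_a, trace)
                      - trace_prob(proc_b, start_b, trace))
            if gap > best_gap:
                best_gap, best_trace = gap, trace
    return best_gap, best_trace

# P: after `a`, half of the time `b` is always accepted, half of the time never.
P = {"s0": {"a": [(0.5, "s1"), (0.5, "s2")]},
     "s1": {"b": [(1.0, "s3")]}}
# Q: after `a`, `b` is accepted with probability 0.5.
Q = {"t0": {"a": [(1.0, "t1")]},
     "t1": {"b": [(0.5, "t2")]}}
# R: `a` itself is accepted only half of the time.
R = {"r0": {"a": [(0.5, "r1")]},
     "r1": {"b": [(1.0, "r2")]}}

print(max_trace_gap(P, "s0", Q, "t0", ["a", "b"], 2))  # (0.0, None)
print(max_trace_gap(P, "s0", R, "r0", ["a", "b"], 2))  # (0.5, ('a',))
```

In this toy example P and Q accept every trace with the same probability, yet they are not bisimilar: after `a`, P reaches states that accept `b` with probability 1 or 0, while the state reached in Q accepts `b` with probability 0.5. This is the kind of gap between trace equivalence and bisimulation that the intermediate K-moment equivalences are meant to fill.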
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Desharnais, J., Laviolette, F., Zhioua, S. (2006). Testing Probabilistic Equivalence Through Reinforcement Learning. In: Arun-Kumar, S., Garg, N. (eds) FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science. FSTTCS 2006. Lecture Notes in Computer Science, vol 4337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11944836_23
DOI: https://doi.org/10.1007/11944836_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49994-7
Online ISBN: 978-3-540-49995-4
eBook Packages: Computer Science, Computer Science (R0)