Predictive Inference Model of the Physical Environment that Emulates Predictive Coding

  • Conference paper
Discovery Science (DS 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14276)

Abstract

In recent years, artificial intelligence that comprehends the real world has grown in significance, driven by efforts to reproduce on a computer the inherent human ability to process intuitive physics. Prior work on real-world understanding has mainly relied on image inference to recognize the physical environment. In contrast, we propose an inference model that predicts the observed environment using both visual and physical features, emulating the predictive coding hypothesized to occur in the human brain, and that detects change points in response to predictive events. Additionally, the model verifies the correctness of the timing of important physical events, such as object collisions and disappearances. Furthermore, the predicted physical information is described in natural language sentences to confirm whether the model accurately recognizes the real world and predicts the next behavior based on that information.

Acknowledgement

This work was supported by the Japan Society for the Promotion of Science KAKENHI Grant Numbers JP22J21786, JP22KJ1355, 23H03453 and JSPS Bilateral Program Number JPJSBP120213504.

Corresponding author

Correspondence to Eri Kuroda.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kuroda, E., Kobayashi, I. (2023). Predictive Inference Model of the Physical Environment that Emulates Predictive Coding. In: Bifet, A., Lorena, A.C., Ribeiro, R.P., Gama, J., Abreu, P.H. (eds) Discovery Science. DS 2023. Lecture Notes in Computer Science, vol 14276. Springer, Cham. https://doi.org/10.1007/978-3-031-45275-8_29

  • DOI: https://doi.org/10.1007/978-3-031-45275-8_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-45274-1

  • Online ISBN: 978-3-031-45275-8

  • eBook Packages: Computer Science, Computer Science (R0)
