Prediction of visual attention with deep CNN on artificially degraded videos for studies of attention of patients with Dementia

Multimedia Tools and Applications

Abstract

Studying the visual attention of patients with dementia, such as Parkinson’s Disease Dementia and Alzheimer’s Disease, is a promising route to non-invasive diagnostics. Past research has shown that people suffering from dementia do not react to degradations in still images. Attempts are now being made to study their visual attention with respect to video content, where delays are expected in their reactions to novelty and to “unusual” novelty in the visual scene. Nevertheless, large-scale screening of the population is possible only if sufficiently robust automatic prediction models can be built. In medical protocols, dementia-related behavior in the observation of visual content is always detected by comparison with healthy “normal control” subjects. Hence, developing automatic prediction models for the specific visual content used in psycho-visual experiments involving Patients with Dementia (PwD) is a research question per se. The difficulty of such prediction lies in the very small amount of training data available. In this paper, the reaction of healthy normal control subjects to degraded areas in videos was studied. Furthermore, a deep learning architecture was designed to build an automatic prediction model for salient areas in intentionally degraded videos for PwD studies, and an optimal transfer learning strategy was deployed to train the model with a very small amount of training data. The predictions were compared with gaze fixation maps and with classical visual attention prediction models. The results are interesting with regard to the reaction of normal control subjects to degraded areas in videos.

Notes

  1. Available at http://www.di.ens.fr/~laptev/actions/hollywood2/

Acknowledgements

This research has been supported by University of Bordeaux, University of Sfax and the grant UNetBA.

Author information

Corresponding author

Correspondence to Souad Chaabouni.

About this article

Cite this article

Chaabouni, S., Benois-Pineau, J., Tison, F. et al. Prediction of visual attention with deep CNN on artificially degraded videos for studies of attention of patients with Dementia. Multimed Tools Appl 76, 22527–22546 (2017). https://doi.org/10.1007/s11042-017-4796-5

  • DOI: https://doi.org/10.1007/s11042-017-4796-5
