Visual Digest Networks

  • Chapter
Digital Human Modeling

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 4650))

Abstract

Attention, understanding and abstraction are three key elements of visual communication that we take for granted. Together, these interconnected elements constitute a Visual Digest Network. In this chapter, we investigate the conceptual design of Visual Digest Networks at three levels of visual abstraction: gaze, object and word. The goal is to minimize the media footprint during visual communication while sustaining essential semantic communication. The Attentive Video Network is designed to detect the operator’s gaze and adjust the video resolution at the sensor side, across the network. Our results show significant improvements in network bandwidth utilization. The Object Video Network is designed for mobile video network applications in which faces and cars are detected. Multi-resolution profiles are configured for the media according to the network footprint, and the video is sent across the network at multiple resolutions together with metadata, under the control of a bandwidth regulator. The results show that the video can be transmitted under low-bandwidth conditions. Finally, the Image-Word Search Network is designed for face reconstruction across the network. In this study, we assume that the hidden layer between the facial features and the referral expressive words contains ‘control points’ that can be articulated mathematically, visually and verbally. This experiment is a crude model of the semantic network. Nevertheless, we see the potential of the two-way mapping.
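To make the gaze-level idea concrete, the following is a minimal sketch of how a gaze-contingent scheme might reduce the number of pixels sent per frame. It is not the chapter's implementation: the square full-resolution window, the uniform subsampling of the periphery, and every name and number (`foveated_pixel_budget`, `fovea_half`, `periphery_step`, the 640×480 frame) are assumptions chosen only for illustration.

```python
def foveated_pixel_budget(width, height, gaze, fovea_half, periphery_step):
    """Count pixels sent for one frame under a hypothetical foveated scheme.

    Assumption: a square window of side 2 * fovea_half centred on the gaze
    point is sent at full resolution, and the whole frame is additionally
    sent subsampled by periphery_step along each axis.
    """
    gx, gy = gaze

    # Clip the full-resolution window to the frame boundaries.
    x0, x1 = max(0, gx - fovea_half), min(width, gx + fovea_half)
    y0, y1 = max(0, gy - fovea_half), min(height, gy + fovea_half)
    fovea_pixels = max(0, x1 - x0) * max(0, y1 - y0)

    # Coarse periphery: ceiling division gives the subsampled frame size.
    coarse_w = -(-width // periphery_step)
    coarse_h = -(-height // periphery_step)
    periphery_pixels = coarse_w * coarse_h

    return fovea_pixels + periphery_pixels


def bandwidth_ratio(width, height, gaze, fovea_half, periphery_step):
    """Fraction of the full-frame pixel count that is actually sent."""
    sent = foveated_pixel_budget(width, height, gaze, fovea_half, periphery_step)
    return sent / (width * height)


if __name__ == "__main__":
    # Hypothetical 640x480 frame, gaze at the centre, 128x128 fovea window,
    # periphery subsampled by 4 in each direction.
    ratio = bandwidth_ratio(640, 480, (320, 240), 64, 4)
    print(f"pixels sent relative to full frame: {ratio:.3f}")
```

Under these assumed parameters the frame needs roughly 12% of the full-frame pixel count, which illustrates why adjusting resolution to the operator's gaze can lower bandwidth use. Real savings depend on the codec, the shape of the fovea region, and the gaze-tracking latency, none of which this sketch models.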

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cai, Y., Milcent, G., Marian, L. (2008). Visual Digest Networks. In: Cai, Y. (ed.) Digital Human Modeling. Lecture Notes in Computer Science, vol 4650. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89430-8_3

  • DOI: https://doi.org/10.1007/978-3-540-89430-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89429-2

  • Online ISBN: 978-3-540-89430-8

  • eBook Packages: Computer Science, Computer Science (R0)
