Abstract
Attention, understanding and abstraction are three key elements of visual communication that we tend to take for granted. Together, these interconnected elements constitute a Visual Digest Network. In this chapter, we investigate the conceptual design of Visual Digest Networks at three levels of visual abstraction: gaze, object and word. The goal is to minimize the media footprint of visual communication while preserving its essential semantic content. The Attentive Video Network detects the operator’s gaze and adjusts the video resolution at the sensor side, across the network. Our results show significant improvements in network bandwidth utilization. The Object Video Network is designed for mobile video network applications in which faces and cars are detected. Multi-resolution profiles are configured for the media according to the network footprint, and the video is sent across the network at multiple resolutions, together with metadata, under the control of a bandwidth regulator. The results show that the video can be transmitted under low-bandwidth conditions. Finally, the Image-Word Search Network is designed for face reconstruction across the network. In this study, we assume that the hidden layer between the facial features and the referring expressive words contains ‘control points’ that can be articulated mathematically, visually and verbally. This experiment is a crude model of the semantic network. Nevertheless, we see potential in the two-way mapping.
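The gaze-level idea can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the block size, the fovea radius and the rule that the downsampling factor doubles with each fovea radius of distance from the gaze point are all assumptions made for illustration. The sketch only estimates how many pixels would be sent for one frame if blocks far from the gaze point were transmitted at reduced resolution.

```python
import math


def block_scale(block_center, gaze, fovea_radius, max_factor=8):
    """Hypothetical rule: factor 1 inside the fovea, doubling per radius beyond it, capped."""
    dist = math.dist(block_center, gaze)
    if dist <= fovea_radius:
        return 1
    rings = int(dist // fovea_radius)  # whole fovea radii between block and gaze
    return min(2 ** rings, max_factor)


def foveated_pixel_budget(width, height, gaze, block=16, fovea_radius=64, max_factor=8):
    """Estimate pixels sent when each block is downsampled by its factor in x and y."""
    sent = 0.0
    for y in range(0, height, block):
        for x in range(0, width, block):
            bw = min(block, width - x)   # clip blocks at the right edge
            bh = min(block, height - y)  # clip blocks at the bottom edge
            center = (x + bw / 2, y + bh / 2)
            factor = block_scale(center, gaze, fovea_radius, max_factor)
            sent += (bw * bh) / (factor * factor)
    return sent


# Assumed example: a 640x480 frame with the operator looking at the centre.
full = 640 * 480
sent = foveated_pixel_budget(640, 480, gaze=(320, 240))
print(f"fraction of full-resolution pixels sent: {sent / full:.3f}")
```

Moving the assumed gaze point toward a corner of the frame reduces the estimate further, because less of the full-resolution region falls inside the frame.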
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Cai, Y., Milcent, G., Marian, L. (2008). Visual Digest Networks. In: Cai, Y. (eds) Digital Human Modeling. Lecture Notes in Computer Science, vol 4650. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89430-8_3
DOI: https://doi.org/10.1007/978-3-540-89430-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89429-2
Online ISBN: 978-3-540-89430-8
eBook Packages: Computer Science, Computer Science (R0)