Abstract
In this chapter, we combine adaptive sampling with video analogies (VA) to correct the audio stream in the karaoke environment \(\kappa = \left\{\kappa(t) :\ \kappa(t) = \left(U(t),\ K(t)\right),\ t \in \left(t_{s},\ t_{e}\right)\right\}\), where \(t_{s}\) and \(t_{e}\) are the start and end times, respectively, and \(U(t)\) is the user multimedia data. We employ multiple streams from the karaoke data \(K(t) = \left(K_{V}(t),\ K_{M}(t),\ K_{S}(t)\right)\), where \(K_{V}(t)\), \(K_{M}(t)\), and \(K_{S}(t)\) are the video, the musical accompaniment, and the original singer's rendition, respectively, along with the user multimedia data \(U(t) = \left(U_{A}(t),\ U_{V}(t)\right)\), where \(U_{V}(t)\) is the user video captured with a camera and \(U_{A}(t)\) is the user's rendition of the song. We analyze the audio and video streaming features \(\Psi(\kappa) = \left\{\Psi(U(t),\ K(t))\right\} = \left\{\Psi(U(t)),\ \Psi(K(t))\right\} = \left\{\Psi_{U}(t),\ \Psi_{K}(t)\right\}\) to produce the corrected singing, namely the output \(U'(t)\), which is made as close as possible to the original singer's rendition. Note that \(\Psi\) represents any kind of feature processing.
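The notation in the abstract can be read as a small data model. The following is a minimal sketch only, not the chapter's implementation: the class names, the `Stream` type, and the placeholder feature `mean_abs` are assumptions introduced for illustration, and \(\Psi\) is passed in as an arbitrary callable because the abstract leaves it unspecified.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical representation: one numeric sample per time step in (t_s, t_e).
Stream = Sequence[float]


@dataclass(frozen=True)
class KaraokeData:
    """K(t) = (K_V(t), K_M(t), K_S(t))."""
    video: Stream          # K_V: karaoke video
    accompaniment: Stream  # K_M: musical accompaniment
    singer: Stream         # K_S: original singer's rendition


@dataclass(frozen=True)
class UserData:
    """U(t) = (U_A(t), U_V(t))."""
    audio: Stream  # U_A: user's rendition of the song
    video: Stream  # U_V: user video captured with a camera


def extract_features(
    user: UserData,
    karaoke: KaraokeData,
    psi: Callable[[Stream], float],
) -> Tuple[Tuple[float, ...], Tuple[float, ...]]:
    """Apply psi to each stream, mirroring Psi(kappa) = {Psi_U(t), Psi_K(t)}."""
    psi_u = tuple(psi(s) for s in (user.audio, user.video))
    psi_k = tuple(
        psi(s) for s in (karaoke.video, karaoke.accompaniment, karaoke.singer)
    )
    return psi_u, psi_k


def mean_abs(stream: Stream) -> float:
    """Placeholder feature (assumption): mean absolute sample value."""
    return sum(abs(x) for x in stream) / len(stream)


# Usage with toy data; the values carry no meaning beyond the illustration.
user = UserData(audio=[1.0, -1.0], video=[0.0, 2.0])
karaoke = KaraokeData(
    video=[1.0, 1.0], accompaniment=[2.0, -2.0], singer=[0.0, 0.0]
)
psi_u, psi_k = extract_features(user, karaoke, mean_abs)
```

Keeping the user and karaoke features in separate tuples reflects the split \(\Psi_{U}(t)\), \(\Psi_{K}(t)\) in the abstract; how those features are then used to produce \(U'(t)\) is not described here and is left out of the sketch.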
Copyright information
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Yan, WQ., Kankanhalli, M.S. (2009). Cross-Modal Approach for Karaoke Artifacts Correction. In: Furht, B. (eds) Handbook of Multimedia for Digital Entertainment and Arts. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-89024-1_9
DOI: https://doi.org/10.1007/978-0-387-89024-1_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-89023-4
Online ISBN: 978-0-387-89024-1
eBook Packages: Computer Science (R0)