ABSTRACT
A music video (MV) is a videotaped performance of a recorded popular song, usually accompanied by dancing and visual imagery. In this paper, we present the design of a generative music video system that automatically creates an audio-video mashup for a given target audio track. The system first segments the target song using beat detection. Next, it selects video segments according to audio similarity analysis and a color-based heuristic. These video segments are then truncated to match the lengths of the corresponding audio segments and concatenated to form the final music video. An evaluation of our system shows that users are receptive to this novel presentation of music videos and are interested in future developments.
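The pipeline outlined above — select the best-scoring candidate clip for each beat-detected audio segment, then truncate it to the segment's duration before concatenation — can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the `score` function stands in for the audio-similarity and color-heuristic scoring described in the abstract, and `match_segments` is a hypothetical helper name.

```python
def match_segments(audio_durations, clips, score):
    """Assign one candidate video clip to each audio segment.

    audio_durations: list of audio segment lengths in seconds
                     (e.g. from beat detection).
    clips: list of (clip_id, clip_duration) candidate video segments.
    score: function(segment_index, clip_id) -> float; a stand-in for the
           combined audio-similarity / color-heuristic score.
    Returns a timeline of (clip_id, start_time, used_duration) tuples.
    """
    timeline = []
    for i, dur in enumerate(audio_durations):
        # Consider only clips long enough to cover the audio segment.
        candidates = [(cid, cdur) for cid, cdur in clips if cdur >= dur]
        if not candidates:
            continue  # a real system would fall back to looping or padding
        # Pick the highest-scoring candidate for this segment.
        best_id, _ = max(candidates, key=lambda c: score(i, c[0]))
        # Truncate the clip to the audio segment's length.
        timeline.append((best_id, 0.0, dur))
    return timeline


if __name__ == "__main__":
    # Toy usage with made-up durations and a made-up scoring function.
    audio = [2.0, 3.5, 1.5]
    clips = [("a", 4.0), ("b", 2.5), ("c", 10.0)]
    toy_score = lambda seg, cid: {"a": 0.2, "b": 0.9, "c": 0.5}[cid]
    print(match_segments(audio, clips, toy_score))
```

In a full system, the resulting timeline would be rendered by cutting and concatenating the selected clips (e.g. with FFmpeg, as the paper's implementation does for media handling).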