Abstract
Video plays an important role in online apparel sales: it is a vital tool for publicity and gives consumers room for imagination. However, because the apparel market is updated rapidly and in large volumes every day, creating videos for a fast-growing number of clothing items is challenging and labor-intensive. To address this, we present ApVideor, a music-driven video generation system customized for displaying clothes. The system consists of two main modules: a music recommendation module and an audio-visual synthesis module. The former helps users find background music that matches the apparel style, while the latter combines the audio and visuals into a video using music-driven approaches. Our user study suggests that the system makes video creation significantly easier and faster than manual creation. In addition, the viewer test suggests that apparel-display videos created with our system are of comparable quality to those created manually by people with video editing experience.
Acknowledgements
This project is supported by the Key Project of National Science Foundation of Zhejiang Province (No. LZ19F020002), the National Natural Science Foundation of China (No. 61672451), and the Provincial Key Research and Development Plan of Zhejiang Province (No. 2019C03137). The authors would also like to thank the Key Laboratory of Design Intelligence and Digital Creativity of Zhejiang Province, the Alibaba-Zhejiang University Joint Institute of Frontier Technologies, and the State Key Lab of CAD&CG.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zhang, H., Zhang, K., Cao, Y. et al. A music-driven system for generating apparel display video. Multimed Tools Appl 79, 5649–5670 (2020). https://doi.org/10.1007/s11042-019-08435-x
DOI: https://doi.org/10.1007/s11042-019-08435-x