A music-driven system for generating apparel display video

Zhang, Hui; Zhang, Kejun; Cao, Yingping; Zheng, Jun; Huang, Xiaoyi; Yang, Changyuan; Sun, Lingyun

doi:10.1007/s11042-019-08435-x

Published: 07 December 2019

Volume 79, pages 5649–5670, (2020)

Multimedia Tools and Applications

Hui Zhang¹, Kejun Zhang¹ (ORCID: orcid.org/0000-0002-0778-2303), Yingping Cao¹, Jun Zheng¹, Xiaoyi Huang¹, Changyuan Yang² & Lingyun Sun³

547 Accesses
2 Citations

Abstract

Video plays an important role in online apparel sales: it is a vital publicity tool and gives consumers room for imagination. However, because large numbers of new apparel items reach the market every day, creating videos for a rapidly growing catalog of clothes is challenging and labor-intensive. To address this, we present ApVideor, a music-driven video generation system customized for displaying clothes. The system consists of two main modules: a music recommendation module and an audio-visual synthesis module. The former helps users find background music that matches the apparel style, while the latter combines the audio and visuals into a video using music-driven approaches. Our user study suggests that the system makes video creation significantly easier and faster than manual creation. The viewer test suggests that apparel display videos created with our system are of comparable quality to those created manually by people with video editing experience.

Notes

http://multichannelmerchant.com/mcm-outlook-2013-research-reports/
ApVideor: A Video Automaker for Apparels
https://www.djangoproject.com/
https://deryy.github.io/ApvideorDemo/page.html
https://pytorch.org/docs/stable/torchvision/models.html
https://librosa.github.io

Acknowledgements

This project is supported by the Key Project of National Science Foundation of Zhejiang Province (No. LZ19F020002), the National Natural Science Foundation of China (No. 61672451), and the Provincial Key Research and Development Plan of Zhejiang Province (No. 2019C03137). The authors also thank the Key Laboratory of Design Intelligence and Digital Creativity of Zhejiang Province, the Alibaba-Zhejiang University Joint Institute of Frontier Technologies, and the State Key Lab of CAD&CG.

Author information

Authors and Affiliations

Laboratory of CAD, CG, Zhejiang University, Hangzhou, China
Hui Zhang, Kejun Zhang, Yingping Cao, Jun Zheng & Xiaoyi Huang
Alibaba NHCI Lab, Alibaba Group, Hangzhou, China
Changyuan Yang
International Design Institute, Zhejiang University, Hangzhou, China
Lingyun Sun

Corresponding author

Correspondence to Kejun Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, H., Zhang, K., Cao, Y. et al. A music-driven system for generating apparel display video. Multimed Tools Appl 79, 5649–5670 (2020). https://doi.org/10.1007/s11042-019-08435-x

Received: 13 March 2019
Revised: 07 August 2019
Accepted: 01 November 2019
Published: 07 December 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s11042-019-08435-x
