Abstract
Music objects have complex, multifaceted features, ranging from short-term, low-level features to long-term, high-level features, and the semantic gap between these levels has not yet been properly resolved. In this paper, we introduce a graph-based approach that organizes the different aspects of features in a unified way. Based on the graph, various kinds of features can be related to and associated with each other. However, on further investigating the graph structure, we observe that the node degree distribution asymptotically follows a power law. As a result, a few hubs (i.e., high-degree nodes) dominate most metrics, and the representation of graph semantics can degenerate. Therefore, we introduce a graph projection operator to reduce the graph complexity and “compress” the graph accordingly. Graph projection is a method of refactoring edge weights in which only a particular set of nodes is retained to show the intrinsic structure of the graph. To demonstrate the feasibility of the graph-based approach, we introduce two applications (music clustering and auto-tagging) and perform experiments. In our experimental study, the projected graph performs better than the unprojected graph.
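The abstract does not spell out how the projection operator recomputes edge weights, so the following Python fragment is only an illustrative sketch of the general idea. It assumes an undirected weighted graph and a hypothetical weighting rule: two retained nodes are linked whenever they share a dropped neighbor, and the new weight is the sum of the products of the two original edge weights. The function name `project`, the node labels, and the weighting rule are assumptions, not the authors' definitions.

```python
from collections import defaultdict
from itertools import combinations


def project(edges, keep):
    """Project a weighted undirected graph onto the node set `keep`.

    edges: iterable of (u, v, weight) tuples.
    keep:  set of nodes retained in the projected graph.
    Assumed rule (not from the paper): retained nodes sharing a dropped
    neighbor get an edge weighted by the product of their two edge weights,
    summed over all shared neighbors.
    """
    via = defaultdict(dict)         # dropped node -> {retained neighbor: weight}
    projected = defaultdict(float)  # (a, b) with a < b -> projected weight
    for u, v, w in edges:
        if u in keep and v in keep:
            # A direct edge between two retained nodes is carried over.
            projected[tuple(sorted((u, v)))] += w
        elif u in keep:
            via[v][u] = via[v].get(u, 0.0) + w
        elif v in keep:
            via[u][v] = via[u].get(v, 0.0) + w
        # Edges between two dropped nodes are ignored in this sketch.
    for neighbors in via.values():
        for a, b in combinations(sorted(neighbors), 2):
            projected[(a, b)] += neighbors[a] * neighbors[b]
    return dict(projected)


# Hypothetical toy graph: songs connected to tag nodes.
edges = [
    ("song1", "rock", 1.0),
    ("song2", "rock", 0.5),
    ("song2", "jazz", 1.0),
    ("song3", "jazz", 2.0),
]
result = project(edges, keep={"song1", "song2", "song3"})
print(result)
```

In this toy case the tag nodes are removed and only song-to-song edges remain, which is the sense in which a projection "compresses" the graph around a chosen set of nodes.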
This research was supported by the National Science Council under Contract No. NSC-101-2221-E-030-008 and No. NSC-102-2218-E-030-002.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Hsu, JL., Ho, CY. (2014). Designing a Multi-modal Association Graph for Music Objects. In: Nah, F.FH. (eds) HCI in Business. HCIB 2014. Lecture Notes in Computer Science, vol 8527. Springer, Cham. https://doi.org/10.1007/978-3-319-07293-7_69
DOI: https://doi.org/10.1007/978-3-319-07293-7_69
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07292-0
Online ISBN: 978-3-319-07293-7
eBook Packages: Computer Science, Computer Science (R0)