Abstract
In recent years, VR devices such as PlayStation VR, HTC Vive, and Oculus Rift have become widespread, making it easy to experience virtual reality. Some VR content achieves a strong sense of immersion by synchronizing the user's motion with CG and music in the virtual space. Research has been conducted on the automatic generation of musical instrument performance animations from MIDI data, a common format for music data. However, these methods suffer from a mismatch between the motion and the sound during the performance, which gives the viewer a sense of incongruity.
In this research, we propose a system that generates brass band animation synchronized with the motion of a conductor's hand using VR devices. The user holds a VR controller as a baton and controls the music and animation by performing conducting motions. Our system acquires the motion of the conductor's hand and calculates its speed and magnitude, which are then used to control the animation and music. Specifically, the point at which the direction of the acquired velocity vector changes is taken as the start point of a beat, and the music speed and animation are controlled using the tempo estimated from these points. In addition, the music volume is changed according to the calculated magnitude of the motion. Furthermore, rendering the CG animation in a virtual space gives the user a sense of immersion. In an experiment, 12 people used the system, and we verified the usefulness of the method.
1 Introduction
A conductor gives instructions to the performers not only by waving the baton but also through gaze and gestures, controlling the ensemble with a variety of expressions. In general, becoming a conductor is not easy and requires broad musical knowledge and ability. Moreover, actually experiencing conducting is very difficult because instruments and players must be prepared.
In this research, we focus on the motion of the conductor's baton and develop a system that allows anyone to easily experience conducting. We focus on two points: the volume of the music, which changes with the magnitude of the baton motion, and the speed of the music and performance animation, which change with the speed of the baton motion. To realize this system, the magnitude and speed of the motion are estimated from the user's baton movements. The volume of the music is then changed according to the estimated magnitude, and the speed of the music and of the CG characters' playing is changed according to the estimated speed. The user's baton motion is obtained in real time using the controller of the HTC Vive [1], a kind of VR device. Furthermore, by displaying the performers in a virtual space through the Vive's head-mounted display (HMD), the user is expected to feel immersed. In estimating the speed of the movement, the tempo is also corrected to reduce the gap between the conductor's motion and the music.
2 Related Work
By synchronizing the user's motion with the animation of a CG character, the result can be enjoyed as content in a virtual space or shown as a performance, and the range of practical applications is expanding.
Goto et al. proposed a method of making a virtual dancer, "Cindy", dance in time with instruments played by multiple people [2]. The technique selects the virtual dancer's poses using the strength of the sound, the pitch of the sound, and the number of notes produced simultaneously. Each player's performance is input and output as sound, and the performance of one player is analyzed so that the CG dancer is displayed in accordance with it. By running these processes on multiple computers distributed over a network, the computational cost is distributed.
Ishizuka et al. proposed a method of synchronizing a performance with a multi-agent system based on the user's performance and baton motion [3]. In this method, score tracking is used to synchronize the performance between the user and the agents. The sequence of notes played by the user is stored as a template, and the performance is tracked by searching the score for the note sequence that best matches it. When the user plays on a MIDI controller, the user's performance can thus be synchronized with a group of agents playing the other instruments.
Lim et al. proposed a method of synchronizing a performance using the gestures of a user playing the flute [4]. In their method, the performance robot HRP-2 is controlled by gestures made with the end of the flute opposite the mouth. These gestures, which are limited to moving the end of the flute up and down, can change the tempo and start and stop the performance. Gesture detection is realized by detecting the inclination of the flute with Hough-transform line detection. The flute is the only instrument this method supports.
3 Proposed System
In this section, we describe how the conductor system proposed in this research is synchronized with the user's motion. The speed and magnitude of the motion are estimated from the user's movements, and the music and performance animation are controlled using these estimates so that the system stays synchronized with the user.
3.1 System Overview
Figure 1 shows the flowchart of the proposed system. The user wears the HMD, holds a VR controller in the right hand, and performs conducting motions. The controller is displayed in the virtual space in the shape of a baton.
In this system, music data (MIDI) is first read using "Midi Tool Kit Pro" [5]. "Midi Tool Kit Pro" is well suited to this system because the speed and volume of the music being played can be easily changed.
Then, according to the user's conducting motion, the CG players displayed on the HMD and the music being played are changed. The instruments used in this system are piano, guitar, drums, trumpet, clarinet, and flute. The instruments to be used are determined from the MIDI data, and the corresponding performers are displayed. Figure 2 shows the arrangement when all players are displayed.
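The authors implement this step with "Midi Tool Kit Pro" in Unity; as a rough illustration of the idea only, the following Python sketch (using the mido library, which is not the toolkit used in the paper) extracts the program-change messages that identify each track's instrument:

```python
import mido

# General MIDI program numbers for the instruments this system displays
# (an assumed mapping for illustration; the paper does not list one).
INSTRUMENTS = {0: "piano", 24: "guitar", 56: "trumpet", 71: "clarinet", 73: "flute"}

def instruments_in(path):
    """Return the set of instrument names found in a MIDI file."""
    found = set()
    for track in mido.MidiFile(path).tracks:
        for msg in track:
            if msg.type == "program_change" and msg.program in INSTRUMENTS:
                found.add(INSTRUMENTS[msg.program])
            # Drums live on MIDI channel 10 (index 9) rather than a program number.
            if getattr(msg, "channel", None) == 9:
                found.add("drums")
    return found

print(instruments_in("song.mid"))  # e.g. {'piano', 'drums', 'trumpet'}
```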
3.2 User Motion Estimation
In this research, we focus on quadruple-meter music, which accounts for the largest share of existing music, and define the user's conducting motions as shown in Fig. 3. The position of the controller shown in the figure is the start position of each motion. The magnitude and speed are estimated from these motions.
3.2.1 Motion Start Position Estimation
In order to estimate the magnitude and speed of the conducting motion, it is necessary to find the start point of each beat. The start point of each beat has one of the following features:
- The point where the direction of the controller's velocity vector changes from top to bottom
- The point where the direction of the controller's velocity vector changes from left to right
- The point where the direction of the controller's velocity vector changes from right to left
The velocity vector indicates the direction and distance the controller has moved per unit time. The start points can be estimated by computing this vector from the controller positions obtained each frame. In addition, to prevent erroneous estimation from small movements, a change is detected only when the change in the magnitude of the velocity vector is 0.2 mm/s or more. This value was determined through preliminary experiments.
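As a minimal sketch of this detection step (the per-frame loop and variable names are our own assumptions, not the authors' implementation), the velocity vector can be computed from successive controller positions, and a beat start point reported whenever the relevant velocity component flips sign above the threshold:

```python
import numpy as np

THRESHOLD = 0.0002  # 0.2 mm/s expressed in m/s, the minimum change from the paper

class BeatDetector:
    """Detect beat start points from per-frame controller positions."""

    def __init__(self):
        self.prev_pos = None
        self.prev_vel = np.zeros(3)

    def update(self, pos, dt):
        """pos: controller position (x, y, z) this frame; dt: frame time in s.
        Returns True when a beat start point is detected."""
        pos = np.asarray(pos, dtype=float)
        if self.prev_pos is None:
            self.prev_pos = pos
            return False
        vel = (pos - self.prev_pos) / dt
        beat = False
        if np.linalg.norm(vel - self.prev_vel) >= THRESHOLD:
            # Vertical flip: upward motion turning downward (top -> bottom)
            down_flip = self.prev_vel[1] > 0 and vel[1] < 0
            # Horizontal flips: left -> right or right -> left (x sign change)
            side_flip = self.prev_vel[0] * vel[0] < 0
            beat = down_flip or side_flip
        self.prev_pos, self.prev_vel = pos, vel
        return beat
```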
3.2.2 Motion Magnitude Estimation
The magnitude of the motion is estimated using the estimated start point of each beat, and the estimated magnitude is used to change the playback volume.
In order to estimate the magnitude of the motion, a midway point is used in addition to the start point of each beat. The midway point is the point where the direction of the velocity vector changes from bottom to top. The midway point of each beat is shown as a red circle in Fig. 3.
Letting L1 be the distance from the start point to the midway point and L2 the distance from the midway point to the start point of the next beat, the magnitude L of the motion is the sum of L1 and L2. This is computed in the "Reference motion estimation" and "User motion estimation" steps of the flowchart in Fig. 1. The estimated motion magnitude is then divided by the reference motion magnitude to determine the music playback volume.
In "Reference motion estimation", the user performs the conducting motion five times before the music data is played. The magnitudes of the beats of these motions are averaged, and the average is used as the reference motion.
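Under the same assumptions as the sketch above, the volume control then reduces to accumulating distances between the detected points; the helpers below are a hypothetical illustration (straight-line distances are assumed, as the paper does not specify path versus straight-line length), not the authors' code:

```python
import numpy as np

def motion_magnitude(start, midway, next_start):
    """L = L1 + L2: distance start -> midway plus midway -> next start."""
    L1 = np.linalg.norm(np.asarray(midway) - np.asarray(start))
    L2 = np.linalg.norm(np.asarray(next_start) - np.asarray(midway))
    return L1 + L2

def volume_scale(L, reference_L):
    """Playback volume factor: estimated magnitude over the reference
    magnitude (the mean L of the five conducting motions made beforehand)."""
    return L / reference_L
```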
3.2.3 Motion Speed Estimation
The speed of the motion is estimated using the estimated start points of the beats. The estimated speed is used to calculate the playback speed magnification, which changes the playback speed of both the music and the animation. Eq. (1) shows the relationship between the conductor's motion time \( T \) and the tempo:

\( \text{tempo} = 60/T \)  (1)
where the conductor's motion time \( T \) is measured from one beat start point to the start point of the next beat. The playback speed magnification is calculated by dividing the obtained tempo by the original tempo read from the MIDI data. In this system, the minimum value of the playback speed magnification is 0.1, the maximum value is 5.0, and the maximum value of the estimated tempo is 240. (The cap of 240 corresponds to a motion time of 0.25 s; a faster conducting motion is not realistic.)
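The tempo estimate and its limits can be expressed compactly; the following sketch assumes the beat interval comes from a detector like the one above:

```python
def playback_magnification(beat_interval, original_tempo):
    """Estimate tempo from the time between beat start points (Eq. 1)
    and convert it to a clamped playback speed magnification."""
    tempo = min(60.0 / beat_interval, 240.0)   # cap the estimated tempo at 240 BPM
    mag = tempo / original_tempo               # ratio to the MIDI file's tempo
    return max(0.1, min(mag, 5.0))             # clamp to [0.1, 5.0]
```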
3.3 Optimization of Playback Speed Magnification
Using the motion magnitude and playback speed magnification obtained in Sects. 3.2.2 and 3.2.3, the system changes the playback volume, playback speed, and animation of the music when the user's motion reaches the start point of the next beat. However, if the playback speed is only updated at the start point of the next beat, a large gap may open between the actual beat position of the music and the start point of the beat motion (for example, when the conducting motion suddenly becomes very slow).
Therefore, when the duration of the previous beat has elapsed during the conducting motion, the playback speed magnification is gradually reduced. The amount of decrease in the playback speed magnification \( M \) is given by Eqs. (2) and (3) below.
In Eq. (2), \( t \) is the duration of the previous beat, \( v \) is the playback speed magnification of the previous beat, \( x \) is the current time, \( t_{min} \) is the time at which the playback speed magnification reaches its minimum value, and \( def \) is the tempo obtained from the MIDI data.
Up to \( t \) seconds, playback proceeds at the magnification \( v \). After \( t \) seconds have elapsed, if the midway point described in Sect. 3.2.2 has already been acquired, the playback speed magnification is reduced using Eq. (2). If the midway point has not yet been acquired, the magnification is instead reduced by a constant amount each frame using Eq. (3) until the midway point is acquired, after which Eq. (2) is applied.
Figure 4 shows the change in the playback speed magnification when \( def = 120 \) and \( t = 0.5 \). Figure 4(a) shows the case where the midway point has already been acquired at 0.5 s, and Fig. 4(b) the case where it is acquired at 0.7 s.
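Since Eqs. (2) and (3) are not reproduced in this text, their exact form is unknown; the sketch below only illustrates the control flow the paragraph describes, with a linear ramp standing in for Eq. (2) and a constant per-frame decrement standing in for Eq. (3). Both forms, and the omission of \( def \), are our assumptions:

```python
MIN_MAG = 0.1  # minimum playback speed magnification

def adjusted_magnification(x, t, v, t_min, midway_acquired, per_frame_step=0.01):
    """Reduce the playback magnification once the previous beat's duration t
    has elapsed. The linear ramp toward MIN_MAG at t_min is a stand-in for
    Eq. (2) (whose real form also involves the MIDI tempo def); the constant
    per-frame decrement is a stand-in for Eq. (3). Requires t_min > t."""
    if x <= t:
        return v  # within the previous beat: keep playing at magnification v
    if midway_acquired:
        # Eq. (2) stand-in: interpolate from v at time t down to MIN_MAG at t_min.
        frac = min((x - t) / (t_min - t), 1.0)
        return v - (v - MIN_MAG) * frac
    # Eq. (3) stand-in: constant decrease each frame until the midway point
    # arrives (the caller feeds the returned value back in as the next v).
    return max(v - per_frame_step, MIN_MAG)
```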
3.4 Performance Animation Generation
In this system, each CG musician performs the animation corresponding to its instrument. The performance animations were generated with "Mixamo" [6] and "EasyMotionRecorder" [8].
"Mixamo" is a web service provided by Adobe that allows users to customize and animate 3D characters. We used it to generate the piano, guitar, and drum animations, which involve large movements.
For motion capture using the Vive, we used the "FinalIK" asset [7], which runs on Unity, together with the "EasyMotionRecorder" script [8]. "FinalIK" allows CG characters to be moved freely within Unity, including by VR devices such as the Vive used in this system. It can therefore be combined with the Vive to capture a performance animation from a person actually playing an instrument. "EasyMotionRecorder" is a script that records and plays back the animation of a CG character motion-captured in this way. The animations are recorded in advance and played back while the system is running.
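Conceptually, this record-and-replay step amounts to storing timestamped skeleton poses and resampling them at playback time; the following Python sketch is our own illustration of that idea and is unrelated to EasyMotionRecorder's actual API:

```python
import bisect

class MotionClip:
    """Record timestamped skeleton poses, then replay them at any time."""

    def __init__(self):
        self.times, self.poses = [], []

    def record(self, time_s, pose):
        """pose: mapping of joint name -> (position, rotation) this frame."""
        self.times.append(time_s)
        self.poses.append(pose)

    def sample(self, time_s):
        """Return the recorded pose nearest to time_s (no interpolation,
        to keep the sketch short)."""
        if not self.poses:
            return None
        i = bisect.bisect_left(self.times, time_s)
        return self.poses[min(i, len(self.poses) - 1)]
```

During playback, advancing the sampling time by the frame time multiplied by the playback speed magnification would make the animation speed track the conducting motion, consistent with the control described in Sect. 3.2.3.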
4 Experimental Results
In order to verify the effectiveness of the proposed system, we asked 12 people, both men and women, to use the system and answer a questionnaire on the following items. Items (1) and (2) were rated on a 5-point scale, and item (3) asked for free comments.
1. Is the music or animation changing according to the speed of the motion?
2. Is the music or animation changing according to the magnitude of the motion?
3. Are there any parts of the music that cause discomfort?
Table 1 shows the results of the questionnaire. Since both items (1) and (2) received a high rating of 4.5 or more, we consider that the tempo and volume were correctly estimated from the user's conducting motion using the proposed method.
In item (3), the following opinions were obtained from the subjects.
- There is a gap between the beat position of the music and the start position of the conducting motion.
- I intend to move the controller at the same speed, but the playback speed feels uneven.
Even after optimizing the playback speed magnification, a slight gap remains between the music and the conducting motion. We believe this can be improved by predicting the motion, for example with deep learning.
In addition, since the motion defined for each beat covers a different distance, the estimated playback speed varies slightly even when the conducting motion is performed at a constant speed. We believe this can be improved by normalizing the estimated tempo by the distance of each beat's motion.
5 Conclusion
In this paper, we proposed a method of synchronizing a conductor's motion with music and CG characters. The user's motion was acquired with the controller of a VR device, and its magnitude and speed were estimated. The music playback volume was controlled from the estimated magnitude, and the music speed and animation from the estimated speed. In addition, by adjusting the playback speed magnification, the gap between the beat of the music and the start point of the conducting motion was reduced.
From the experimental results, it was confirmed that the magnitude and speed of the motion could be estimated using the proposed method, and that the user's motion was reflected in the music and in the animation of the CG characters.
As future work, we plan to adjust the playback speed magnification using a stochastic model trained on arm motion data. Furthermore, although this paper focused on quadruple-meter music, we will also extend the system to music in other meters.
References
VIVE™. https://www.vive.com/
Goto, M., Muraoka, Y.: Interactive performance of music danced CG dancer. In: Proceedings on Workshop on Interactive Systems and Softwares, vol. 95, pp. 18–19 (1995)
Ishizuka, K., Nakamura, R., Gotoh, T., Tamura, N., Shimada, H.: Multiple agent simulation of joint performance with virtual orchestral and real player. ITE Tech. Rep. 37, 99–102 (2013)
Lim, A., et al.: Musical robot co-player: real-time synchronization with a human flutist recognizing visual start and end cues. IPSJ J. 52(12), 3599–3610 (2011)
Midi Tool Kit Pro - Asset Store. https://assetstore.unity.com/packages/tools/audio/midi-tool-kit-pro-115331
Mixamo. https://www.mixamo.com/
Final IK - Asset Store. https://assetstore.unity.com/packages/tools/animation/final-ik-14290
EasyMotionRecorder. https://github.com/duo-inc/EasyMotionRecorder