The goal of the bachelor thesis is to implement a framework for motion capture of movement during music performance. The musician will be playing an instrument (e.g. a guitar), and the movement of the arms, hands and fingers will be captured. For the capturing, an inertial motion capture suit and gloves will be used.

Additionally, the whole performance will be captured by an RGBD camera, and an animated point cloud will be generated. This can be used for skeleton refitting and recalibration. The scanned point cloud can also be used to generate a mesh model of the musician.


All of the articles referenced in the text are numbered, linked and briefly described in the sheet below:
Spreadsheet - Articles On Guitar MoCap

There is still only limited existing research that relates directly to motion capture of guitar performance; however, related data collection is required for analysis.

Estimation of Guitar Fingering and Plucking Controls Based on Multimodal Analysis of Motion, Audio and Musical Score

The first article (2015) describes a method for extracting instrumental controls during guitar performances based on the analysis of multimodal data combining motion capture, audio analysis and the musical score. Marker-based high-speed video cameras are used to track the positions of finger bones and joints, and audio is recorded with a transducer measuring vibration on the guitar body. The extracted parameters are divided into left-hand controls, i.e. fingering (which string and fret is pressed with a left-hand finger), and right-hand controls, i.e. the plucked string, the plucking finger and the characteristics of the pluck (position, velocity and angles with respect to the string). Note onsets are detected via audio analysis, the pitch is extracted from the score, and distances are computed using 3D Euclidean geometry.
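The distance computations mentioned above can be sketched as a point-to-segment test in 3D. The snippet below is only an illustration of that geometric step; the marker coordinates, string layout and the closest-string decision rule are invented here, not taken from the article:

```python
import numpy as np

def point_to_segment_distance(p, a, b):
    """Euclidean distance from marker position p to the segment a-b (a string)."""
    ab = b - a
    # project p onto the line through a and b, clamped to the segment
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

# hypothetical fingertip marker and string endpoints (nut to bridge), in metres
fingertip = np.array([0.30, 0.012, 0.004])
strings = {s: (np.array([0.0, 0.01 * s, 0.0]), np.array([0.65, 0.01 * s, 0.0]))
           for s in range(1, 7)}

# estimate the plucked string as the one closest to the fingertip marker
plucked = min(strings, key=lambda s: point_to_segment_distance(fingertip, *strings[s]))
```

In the real method this test would be combined with the audio-detected note onsets, so that distances are only evaluated inside a window around each pluck.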

The two most relevant contributions of the method are (1) the combination of multimodal data and (2) the interpolation of marker trajectories together with the use of a plucking window, which make marker identification more robust and boost the estimation rates.


The same motion capture method as in the first article was also used in the dissertation by Jonathan Norton (2008). Several different approaches, mainly using data acquisition gloves, had been attempted, but for right-hand classical guitar articulations these gloves proved either too bulky, lacked exposed fingertips so that the guitarist could pluck the strings, or lacked the resolution and sampling rate needed to capture fine finger movement. However, since the dissertation was written, data glove technology has advanced and may now be able to overcome the limitations encountered in that research.

The final approach in the dissertation was based on the relative movement of each marker. These relative movement calculations allowed the compound marker movements to be filtered out (in much the same way as would be done with joint angle analysis). Through the multi-dimensional data reduction technique of singular value decomposition (SVD), each articulation was reduced to its fundamental shape components, in this case frames and markers. Using autocorrelation and peak detection, only those components that exhibited a certain level of periodicity were selected. Finally, to aid in the threshold investigation and dimensionality reduction, the concept of regions of interest (ROI) was introduced (where to begin looking for marker activity for each articulation).
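The SVD-plus-periodicity step can be illustrated on synthetic data. The trajectories, noise level and periodicity threshold below are made up for the sketch and are not the dissertation's values:

```python
import numpy as np

# synthetic trajectory matrix (frames x markers): one periodic motion
# component shared across markers, plus measurement noise
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
periodic = np.sin(t)[:, None] * rng.normal(size=(1, 6))
data = periodic + 0.05 * rng.normal(size=(200, 6))

# SVD reduces the articulation to its fundamental shape components
U, s, Vt = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)

def periodicity(signal):
    """Height of the strongest autocorrelation peak away from lag zero (0..1)."""
    x = signal - signal.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac = ac / ac[0]
    return ac[5:].max()  # skip the trivially high near-zero lags

# keep only components whose temporal weights look periodic
kept = [i for i in range(len(s)) if periodicity(U[:, i]) > 0.5]
```

With this construction the first left singular vector recovers the periodic motion and passes the threshold, while the noise components are rejected.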


Complicated fingering forms in guitar playing tend to cause various occlusion problems. This was the main focus of the article on the photo-reflector technique (2004). The proposed system obtains guitar fingering with an optical motion capture technique that employs photo-reflectors embedded in the guitar fingerboard. The photo-reflectors consist of full-color LEDs and phototransistors. The positions of the fingers are detected by the phototransistors, which measure the amount of light emitted by the LEDs and reflected by the fingers.
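The detection principle can be sketched as simple thresholding of the reflected-light readings. The sensor values and the threshold below are invented for illustration and do not come from the article:

```python
# hypothetical phototransistor readings, one per fret position (normalised);
# a fingertip close to the fingerboard reflects more LED light back
readings = [0.05, 0.07, 0.92, 0.10, 0.88, 0.06]
THRESHOLD = 0.5  # assumed calibration value

# frets whose sensor sees a nearby finger
pressed_frets = [fret for fret, r in enumerate(readings, start=1) if r > THRESHOLD]
# → [3, 5]
```

Because the sensors sit in the fingerboard itself, this approach is inherently immune to the camera occlusions that motivate it.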

A Gesture Detection with Guitar Pickup and Earphone

As well as the classical guitar, the electric guitar has also been considered and used in several experiments, e.g. in the article on gesture detection (2014). The system described there generates sound from the player's gestures (a gesture-based effect control system). The gestures are captured by analyzing the Doppler shift of a signal transmitted from a transmitter attached to the player's picking hand to the magnetic pickup installed on the electric guitar; a processor then converts this Doppler shift into a delay time. Using this system, players can control the effect unit with their own guitar and earphones.
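The Doppler relation itself is standard physics; the linear mapping from frequency shift to delay time below is an assumption for illustration, since the article's exact mapping is not reproduced here:

```python
def doppler_shift_hz(f0_hz, velocity_ms, c_ms=343.0):
    """Frequency shift of a carrier f0 for a hand moving at radial velocity v.

    c_ms is the propagation speed of the carrier; 343 m/s (sound in air)
    is an assumed value for this sketch.
    """
    return f0_hz * velocity_ms / c_ms

def shift_to_delay_ms(shift_hz, min_ms=5.0, max_ms=500.0, max_shift_hz=200.0):
    """Assumed linear map from |shift| into a bounded effect delay-time range."""
    x = min(abs(shift_hz), max_shift_hz) / max_shift_hz
    return min_ms + x * (max_ms - min_ms)

# a picking hand moving at 1 m/s shifts a 20 kHz carrier by roughly 58 Hz
delay = shift_to_delay_ms(doppler_shift_hz(20000.0, 1.0))
```

The essential idea is that faster picking-hand motion produces a larger shift and therefore a different effect setting, all measured through the guitar's own pickup.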

“Phantom Axe”: A Motion Capture Air Guitar

A natural follow-up question is whether gesture control can be used to play a virtual instrument. This question was examined while developing «“Phantom Axe”: A Motion Capture Air Guitar» (2016). The technology used for motion capture was the Microsoft Kinect. Although the Leap Motion Controller is much more accurate for tracking hand movements and finger gestures, it can only track arms and hands, and as a result would not be able to track the motion of a person playing the guitar (left arm outstretched and right arm “strumming” near the hip). However, even the Microsoft Kinect has its limitations, e.g. the sensor's inability to track finger movements (only the tip of the hand and the wrist), which means the user cannot finger actual guitar chords.

The Virtual Air Guitar

A similar project was developed at the Helsinki University of Technology (2005). It utilised a simple webcam to track orange gloves worn by the user, with sound being generated as the user's right hand passes through the centre-line of the imaginary guitar.


Another approach, concerning expressive intention, i.e. the communication of moods and feelings in performer-computer interaction during music performance, is presented in the study of a multilevel mapping strategy (2005). An application based on the model described in the study was developed using the eMotion SMART motion capture system and the EyesWeb software.

Hidden melody in music playing motion: Music recording using optical motion tracking system

The University of Oslo further developed the idea of music recording using an optical motion tracking system in the article «Hidden melody in music playing motion» (2016). The sound is recorded using optical marker-based motion tracking with high-speed infrared cameras. Thanks to the high spatial precision and high sampling rate, local acoustic vibrations are also recorded within the motion data and can be transformed into the actual sound radiating from the acoustic instrument.
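The separation of audible vibration from the gross playing motion can be sketched by removing a short moving average from a high-rate marker trajectory. All signals below are synthetic and the amplitudes are illustrative only; the article's actual processing chain is not reproduced here:

```python
import numpy as np

fs = 10000.0                                    # assumed mocap sampling rate
t = np.arange(0, 0.1, 1 / fs)
gesture = 0.2 * np.sin(2 * np.pi * 2 * t)       # slow playing movement
vibration = 1e-3 * np.sin(2 * np.pi * 440 * t)  # body vibration at A4
marker_y = gesture + vibration                  # what the camera records

# subtract a ~2 ms moving average: the trend keeps the slow gesture,
# the residual keeps mainly the acoustic vibration
window = 20
trend = np.convolve(marker_y, np.ones(window) / window, mode="same")
audio = (marker_y - trend)[50:-50]              # trim filter edge effects

# the dominant frequency of the residual recovers the played pitch
freqs = np.fft.rfftfreq(audio.size, 1 / fs)
peak_hz = freqs[np.argmax(np.abs(np.fft.rfft(audio)))]
```

This is only feasible because the tracking cameras sample far above audio-relevant rates, which is exactly the property the article exploits.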

Exploring micromovements with motion capture and sonification

Concerning the accuracy of the captured motion and thus of the generated sound, there is another article from the University of Oslo (2011) focused on how micromovements may be used in an interactive dance/music performance. It describes turning micromovement data into sound, i.e., sonification of the data.
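A minimal sonification sketch is shown below: frame-to-frame marker displacement is scaled onto a MIDI-style pitch range. The mapping, note range and displacement scale are assumptions for illustration, not the article's design:

```python
import math

def to_pitch(disp_mm, base_note=48, span=36, max_disp_mm=2.0):
    """Map a micromovement displacement (mm) onto base_note..base_note+span."""
    x = min(disp_mm, max_disp_mm) / max_disp_mm
    return base_note + round(x * span)

# stillness maps to the base note, larger micromovements to higher pitches
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 1.5, 0.0)]
pitches = [to_pitch(math.dist(p0, p1)) for p0, p1 in zip(positions, positions[1:])]
```

Any such mapping makes the tracking accuracy directly audible, which is why micromovement work demands high-precision capture.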

SoundSaber - A Motion Capture Instrument

Speaking of accurate motion capture, there is also a project called «SoundSaber - A Motion Capture Instrument» (2011), dealing with the question of how high-fidelity motion capture equipment can be used for prototyping musical instruments. It presents a low-cost implementation of a motion capture instrument using a high-end motion capture system from Qualisys (which replaced the initial 8-camera OptiTrack system).

Gestural Control of Music Using the Vicon 8 Motion Capture System

Another technology for motion capture and gestural control of music was used in a project at the University of California (2003). The project involves the development of, and experimentation with, software to receive data from a Vicon motion capture system and to translate and map that data into control data for music and other media such as lighting.

Embodied Hands: Modeling and Capturing Hands and Bodies Together

The latest research on capturing hand movement was presented at SIGGRAPH Asia (2017). As the title reveals, the project focused on capturing and replicating the coordinated movement of hands and body together. This was achieved with a model called MANO (hand Model with Articulated and Non-rigid defOrmations), learned from around 1000 high-resolution 3D scans of the hands of 31 subjects in a wide variety of hand poses. The model is realistic, low-dimensional, captures non-rigid shape changes with pose, is compatible with standard graphics packages, and can fit any human hand.

Online Generative Model Personalization for Hand Tracking

Another project presented at SIGGRAPH Asia (2017) tackles a similar task, but performs the capture in real time. It presents a new algorithm for real-time hand tracking on commodity depth-sensing devices. The method does not require a user-specific calibration session; instead, it learns the hand geometry as the user performs live in front of the camera.


[1]   Virtual Air Guitar Company Oy, "The Virtual Air Guitar," 2005. [Online]. Available: http://airguitar.tml.hut.fi/whatis.html. [Accessed 25th November 2017].
[2]   N. Aoki, S. Tanahashi, E. Kishimoto, S. Yasuda and M. Iwakoshi, "Capturing Guitar Fingering by Photo-Reflector Technique," Sapporo, Japan, 2004.
[3]   C. Dobrian and F. Bevilacqua, "Gestural Control of Music Using the Vicon 8 Motion Capture System," University of California, Irvine, 2003.
[4]   D. Fenza, L. Mion, S. Canazza and A. Roda, "Physical movement and musical gestures: a multilevel mapping strategy," in Proceedings of Sound and Music Computing Conference, Salerno, 2005.
[5]   D. Findon, "Phantom Axe: A Motion Capture Air Guitar," School of Computer Science and Informatics, Cardiff University, September 2016.
[6]   A. R. Jensenius and K. A. V. Bjerkestrand, "Exploring micromovements with motion capture and sonification," 2011.
[7]   J. Norton, "Motion capture to build a foundation for a computer-controlled instrument by study of classical guitar performance," Stanford University, 2008.
[8]   K. Nymoen, S. A. Skogstad and A. R. Jensenius, "SoundSaber - A Motion Capture Instrument," in Proceedings of the International Conference on New Interfaces for Musical Expression, Oslo, Norway, 2011.
[9]   A. Perez-Carrillo, J. Arcos and M. Wanderley, "Estimation of Guitar Fingering and Plucking Controls Based on Multimodal Analysis of Motion, Audio and Musical Score," vol. 9617, 2016.
[10]   J. Romero, D. Tzionas and M. J. Black, "Embodied Hands: Modeling and Capturing Hands and Bodies Together," ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), vol. 36, no. 6, 2017.
[11]   M. Song, "Hidden melody in music playing motion: Music recording using optical motion tracking system," in Proceedings of the 22nd International Congress on Acoustics, Buenos Aires, 2016.
[12]   S. Suh, J. Lee and W. S. Yeo, "A Gesture Detection with Guitar Pickup and Earphone," in Proceedings of The International Conference on New Interfaces for Musical Expression, Goldsmiths, University of London, 2014.
[13]   A. Tkach, A. Tagliasacchi, E. Remelli, M. Pauly and A. Fitzgibbon, "Online Generative Model Personalization for Hand Tracking," ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), vol. 36, no. 6, 2017.

Used Technologies

Synertial Motion Capture Gloves

  • Stand-alone, bundled with a suit, or multiple pairs for use with optical rigs
  • Self-calibrating sensors to adapt to magnetic environments
  • User-configured kinematic structures
  • Instant calibration (configurable calibration poses)
  • Replace batteries and continue capturing where you left off without SW restart or recalibration in MoBu, Unity, etc.
  • Occlusion-free
  • Extremely accurate clock in the sync system for optical rig integration (Vicon, Motion Analysis, OptiTrack)
  • Real-time in MotionBuilder, Unity, Unreal, and Siemens Jack
  • Free user-friendly SDK and samples included with MoCap SW
  • Detachable electronics; change sizes in minutes
  • Short and long-fingered glove cloth types available
Source: https://www.synertial.com/mocapgloves

Our glove setup


More technologies to be added soon...



1. Research

  collecting resources, similar projects

2. Implementation

  1. Data collecting – real-time
     • setting up the equipment
     • capturing the motion
  2. Data analyzing and processing – offline
     • device wrapper (C++)
     • optimization, filtration, synchronization
  3. Data output
     • export into an output file – BioVision Hierarchical motion capture data (.bvh) or similar
     • optional: rendering in virtual reality (Unity) – animated figure
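The planned .bvh export can be sketched for a single joint chain. The joint name, offsets and channel order below are placeholders; a real exporter would walk the full captured skeleton:

```python
def write_bvh(path, frames, frame_time=1.0 / 120.0):
    """Write a minimal BVH file; frames is a list of (Z, X, Y) Euler rotations
    for the single root joint (a full root would also carry position channels)."""
    lines = [
        "HIERARCHY",
        "ROOT Hand",
        "{",
        "  OFFSET 0.0 0.0 0.0",
        "  CHANNELS 3 Zrotation Xrotation Yrotation",
        "  End Site",
        "  {",
        "    OFFSET 0.0 10.0 0.0",
        "  }",
        "}",
        "MOTION",
        f"Frames: {len(frames)}",
        f"Frame Time: {frame_time:.6f}",
    ]
    # one line of channel values per captured frame
    lines += [" ".join(f"{v:.4f}" for v in frame) for frame in frames]
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
```

The same HIERARCHY/MOTION split is what the loader on the Unity side would parse back when animating the figure.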

3. The initial chapter

  writing the first chapter of the bachelor thesis

4. Test and validation

  finalizing the application

5. Other chapters

  writing the rest of the thesis document


  • More coming soon...

  • Document - finishing implementation chapter

  • Recording session

    30/4 - 1/5/18
  • Adding Synertial data streamer to PhoXi camera API code

  • BVH loader - updating code to attach hand skeleton with forearm

  • New hand skeletons - with forearm bone

  • HTC Vive - setting up, troubleshooting, synchronizing with mocap glove - attaching tracker to a bone in skeleton file

  • Document - writing chapters on proposed solution and implementation

    March - April '18
  • Kinexact Hand - updating calibration tool to use new structures

    February '18
  • Finishing prototype

    Demo videos and source files

  • Programming BVH file loader, exporter + attaching hand mocap to body skeleton

  • Animating rendered model with recorded data

    Web preview of the initial model

  • Writing the initial chapter of the thesis

  • Making a presentation

    ppt pdf

  • Research of existing applications

October '17


Hand motion attached to the whole-body skeleton.

Motion applied to the character rendered in Unity (the video quality is somewhat poor due to rendering and capturing the video at the same time).

Complete .zip file contains:

  • initial chapter of the document
  • prototype source files
  • demo files
  • example character model (on which to apply motion demo files in Unity)
  • demo videos

  Download prototype ZIP file here


Dana Škorvánková

RNDr. Martin Madaras