The goal of the bachelor thesis is to implement a framework for motion capture of movement during music performance. The musician will be playing an instrument (e.g. a guitar), and the movement of the arms, hands and fingers will be captured. For the capturing, an inertial motion capture suit and gloves will be used.

Additionally, the whole performance will be captured by an RGBD camera, and an animated point cloud will be generated. This can be used for skeleton refitting and recalibration. The scanned point cloud can also be used to generate a mesh model of the musician.


All of the articles referenced in the text are numbered, linked and briefly described in the sheet below:
Spreadsheet - Articles On Guitar MoCap

There is still only limited existing research that relates directly to motion capture of guitar performance; however, related data collection is required for analysis.

Estimation of Guitar Fingering and Plucking Controls Based on Multimodal Analysis of Motion, Audio and Musical Score

The first article (2015) describes a method for extracting instrumental controls during guitar performances based on the analysis of multimodal data combining motion capture, audio analysis and the musical score. Marker-based high-speed video cameras are used to track the positions of finger bones and joints, and audio is recorded with a transducer measuring vibration on the guitar body. The extracted parameters are divided into left-hand controls, i.e. fingering (which string and fret is pressed with a left-hand finger), and right-hand controls, i.e. the plucked string, the plucking finger and the characteristics of the pluck (position, velocity and angles with respect to the string). Note onsets are detected via audio analysis, the pitch is extracted from the score, and distances are computed using 3D Euclidean geometry.
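The distance computations mentioned above can be sketched as a point-to-segment test in 3D. The snippet below is only an illustration of that geometric step; the marker coordinates, string layout and the closest-string decision rule are invented here, not taken from the article:

```python
import numpy as np

def point_to_segment_distance(p, a, b):
    """Euclidean distance from marker position p to the segment a-b (a string)."""
    ab = b - a
    # project p onto the line through a and b, clamped to the segment
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

# hypothetical fingertip marker and string endpoints (nut to bridge), in metres
fingertip = np.array([0.30, 0.012, 0.004])
strings = {s: (np.array([0.0, 0.01 * s, 0.0]), np.array([0.65, 0.01 * s, 0.0]))
           for s in range(1, 7)}

# estimate the plucked string as the one closest to the fingertip marker
plucked = min(strings, key=lambda s: point_to_segment_distance(fingertip, *strings[s]))
```

In the real method this test would be combined with the audio-detected note onsets, so that distances are only evaluated inside a window around each pluck.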

The two most relevant contributions of the method are (1) the combination of multimodal data and (2) the interpolation of marker trajectories together with the use of a plucking window, which make marker identification more robust and boost the estimation rates.


The same motion capture method as in the first article was also used in the dissertation by Jonathan Norton (2008). Several different approaches, mainly using data acquisition gloves, had been attempted, but for right-hand classical guitar articulations these gloves proved either too bulky, lacked exposed fingertips so that the guitarist could pluck the strings, or lacked the resolution and sampling rate needed to capture fine finger movement. However, since the dissertation was written, data glove technology has advanced and may now be able to overcome the limitations encountered in that research.

The final approach in the dissertation was based on the relative movement of each marker. These relative movement calculations allowed the compound marker movements to be filtered out (in much the same way as would be done with joint angle analysis). Through the multi-dimensional data reduction technique of singular value decomposition (SVD), each articulation was reduced to its fundamental shape components, in this case frames and markers. Using autocorrelation and peak detection, only those components that exhibited a certain level of periodicity were selected. Finally, to aid in the threshold investigation and dimensionality reduction, the concept of regions of interest (ROI) was introduced (where to begin looking for marker activity for each articulation).
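The SVD-plus-periodicity step can be illustrated on synthetic data. The trajectories, noise level and periodicity threshold below are made up for the sketch and are not the dissertation's values:

```python
import numpy as np

# synthetic trajectory matrix (frames x markers): one periodic motion
# component shared across markers, plus measurement noise
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
periodic = np.sin(t)[:, None] * rng.normal(size=(1, 6))
data = periodic + 0.05 * rng.normal(size=(200, 6))

# SVD reduces the articulation to its fundamental shape components
U, s, Vt = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)

def periodicity(signal):
    """Height of the strongest autocorrelation peak away from lag zero (0..1)."""
    x = signal - signal.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac = ac / ac[0]
    return ac[5:].max()  # skip the trivially high near-zero lags

# keep only components whose temporal weights look periodic
kept = [i for i in range(len(s)) if periodicity(U[:, i]) > 0.5]
```

With this construction the first left singular vector recovers the periodic motion and passes the threshold, while the noise components are rejected.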


Complicated fingering forms in guitar playing tend to cause various occlusion problems. This was the main focus of the article on the photo-reflector technique (2004). The proposed system obtains guitar fingering with an optical motion capture technique that employs photo-reflectors embedded in the guitar fingerboard. The photo-reflectors consist of full-color LEDs and phototransistors. The positions of the fingers are detected by the phototransistors, which measure the amount of light emitted by the LEDs and reflected by the fingers.
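The detection principle can be sketched as simple thresholding of the reflected-light readings. The sensor values and the threshold below are invented for illustration and do not come from the article:

```python
# hypothetical phototransistor readings, one per fret position (normalised);
# a fingertip close to the fingerboard reflects more LED light back
readings = [0.05, 0.07, 0.92, 0.10, 0.88, 0.06]
THRESHOLD = 0.5  # assumed calibration value

# frets whose sensor sees a nearby finger
pressed_frets = [fret for fret, r in enumerate(readings, start=1) if r > THRESHOLD]
# → [3, 5]
```

Because the sensors sit in the fingerboard itself, this approach is inherently immune to the camera occlusions that motivate it.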

A Gesture Detection with Guitar Pickup and Earphone

As well as the classical guitar, the electric guitar has also been considered and used in several experiments, e.g. in the article on gesture detection (2014). The system described there generates sound from the player's gestures (a gesture-based effect control system). The gestures are captured by analyzing the Doppler shift of a signal transmitted from a transmitter attached to the player's picking hand to the magnetic pickup installed on the electric guitar; a processor then converts this Doppler shift into a delay time. Using this system, players can control the effect unit with their own guitar and earphones.
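The Doppler relation itself is standard physics; the linear mapping from frequency shift to delay time below is an assumption for illustration, since the article's exact mapping is not reproduced here:

```python
def doppler_shift_hz(f0_hz, velocity_ms, c_ms=343.0):
    """Frequency shift of a carrier f0 for a hand moving at radial velocity v.

    c_ms is the propagation speed of the carrier; 343 m/s (sound in air)
    is an assumed value for this sketch.
    """
    return f0_hz * velocity_ms / c_ms

def shift_to_delay_ms(shift_hz, min_ms=5.0, max_ms=500.0, max_shift_hz=200.0):
    """Assumed linear map from |shift| into a bounded effect delay-time range."""
    x = min(abs(shift_hz), max_shift_hz) / max_shift_hz
    return min_ms + x * (max_ms - min_ms)

# a picking hand moving at 1 m/s shifts a 20 kHz carrier by roughly 58 Hz
delay = shift_to_delay_ms(doppler_shift_hz(20000.0, 1.0))
```

The essential idea is that faster picking-hand motion produces a larger shift and therefore a different effect setting, all measured through the guitar's own pickup.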

“Phantom Axe”: A Motion Capture Air Guitar

A natural follow-up question is whether gesture control can be used to play a virtual instrument. This question was examined while developing «“Phantom Axe”: A Motion Capture Air Guitar» (2016). The technology used for motion capture was the Microsoft Kinect. Although the Leap Motion Controller is much more accurate for tracking hand movements and finger gestures, it can only track arms and hands, and as a result would not be able to track the motion of a person playing the guitar (left arm outstretched and right arm “strumming” near the hip). However, even the Microsoft Kinect has its limitations, e.g. the sensor's inability to track finger movements (only the tip of the hand and the wrist), which means the user cannot finger actual guitar chords.

The Virtual Air Guitar

A similar project was developed at the Helsinki University of Technology (2005). It utilised a simple webcam to track orange gloves worn by the user, with sound being generated as the user's right hand passes through the centre-line of the imaginary guitar.


Another approach, concerning expressive intention, i.e. the communication of moods and feelings in performer-computer interaction during music performance, is presented in the study of a multilevel mapping strategy (2005). An application based on the model described in the study was developed using the eMotion SMART motion capture system and the EyesWeb software.

Hidden melody in music playing motion: Music recording using optical motion tracking system

The University of Oslo further developed the idea of music recording using an optical motion tracking system in the article «Hidden melody in music playing motion» (2016). The sound is recorded using optical marker-based motion tracking with high-speed infrared cameras. Thanks to the high spatial precision and high sampling rate, local acoustic vibrations are also recorded within the motion data and can be transformed into the actual sound radiating from the acoustic instrument.
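The separation of audible vibration from the gross playing motion can be sketched by removing a short moving average from a high-rate marker trajectory. All signals below are synthetic and the amplitudes are illustrative only; the article's actual processing chain is not reproduced here:

```python
import numpy as np

fs = 10000.0                                    # assumed mocap sampling rate
t = np.arange(0, 0.1, 1 / fs)
gesture = 0.2 * np.sin(2 * np.pi * 2 * t)       # slow playing movement
vibration = 1e-3 * np.sin(2 * np.pi * 440 * t)  # body vibration at A4
marker_y = gesture + vibration                  # what the camera records

# subtract a ~2 ms moving average: the trend keeps the slow gesture,
# the residual keeps mainly the acoustic vibration
window = 20
trend = np.convolve(marker_y, np.ones(window) / window, mode="same")
audio = (marker_y - trend)[50:-50]              # trim filter edge effects

# the dominant frequency of the residual recovers the played pitch
freqs = np.fft.rfftfreq(audio.size, 1 / fs)
peak_hz = freqs[np.argmax(np.abs(np.fft.rfft(audio)))]
```

This is only feasible because the tracking cameras sample far above audio-relevant rates, which is exactly the property the article exploits.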

Exploring micromovements with motion capture and sonification

Concerning the accuracy of the captured motion and thus of the generated sound, there is another article from the University of Oslo (2011) focused on how micromovements may be used in an interactive dance/music performance. It describes turning micromovement data into sound, i.e., sonification of the data.
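A minimal sonification sketch is shown below: frame-to-frame marker displacement is scaled onto a MIDI-style pitch range. The mapping, note range and displacement scale are assumptions for illustration, not the article's design:

```python
import math

def to_pitch(disp_mm, base_note=48, span=36, max_disp_mm=2.0):
    """Map a micromovement displacement (mm) onto base_note..base_note+span."""
    x = min(disp_mm, max_disp_mm) / max_disp_mm
    return base_note + round(x * span)

# stillness maps to the base note, larger micromovements to higher pitches
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 1.5, 0.0)]
pitches = [to_pitch(math.dist(p0, p1)) for p0, p1 in zip(positions, positions[1:])]
```

Any such mapping makes the tracking accuracy directly audible, which is why micromovement work demands high-precision capture.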

SoundSaber - A Motion Capture Instrument

Speaking of accurate motion capture, there is also a project called «SoundSaber - A Motion Capture Instrument» (2011), dealing with the question of how high-fidelity motion capture equipment can be used for prototyping musical instruments. It presents a low-cost implementation of a motion capture instrument using a high-end motion capture system from Qualisys (which replaced the initial 8-camera OptiTrack system).

Gestural Control of Music Using the Vicon 8 Motion Capture System

Another technology for motion capture and gestural control of music was used in a project at the University of California (2003). The project involves the development of, and experimentation with, software to receive data from a Vicon motion capture system and to translate and map that data into control data for music and other media such as lighting.

Embodied Hands: Modeling and Capturing Hands and Bodies Together

The latest research on capturing hand movement was presented at SIGGRAPH Asia (2017). As the title reveals, the project focused on capturing and replicating the coordinated movement of hands and body together. This was achieved with a model called MANO (hand Model with Articulated and Non-rigid defOrmations), learned from around 1000 high-resolution 3D scans of the hands of 31 subjects in a wide variety of hand poses. The model is realistic, low-dimensional, captures non-rigid shape changes with pose, is compatible with standard graphics packages, and can fit any human hand.

Online Generative Model Personalization for Hand Tracking

Another project presented at SIGGRAPH Asia (2017) tackles a similar task, but performs the capture in real time. It presents a new algorithm for real-time hand tracking on commodity depth-sensing devices. The method does not require a user-specific calibration session; instead, it learns the hand geometry as the user performs live in front of the camera.


[1]   Virtual Air Guitar Company Oy, "The Virtual Air Guitar," 2005. [Online]. Available: http://airguitar.tml.hut.fi/whatis.html. [Accessed 25th November 2017].
[2]   N. Aoki, S. Tanahashi, E. Kishimoto, S. Yasuda and M. Iwakoshi, "Capturing Guitar Fingering by Photo-Reflector Technique," Sapporo, Japan, 2004.
[3]   C. Dobrian and F. Bevilacqua, "Gestural Control of Music Using the Vicon 8 Motion Capture System," University of California, Irvine, 2003.
[4]   D. Fenza, L. Mion, S. Canazza and A. Roda, "Physical movement and musical gestures: a multilevel mapping strategy," in Proceedings of Sound and Music Computing Conference, Salerno, 2005.
[5]   D. Findon, "Phantom Axe: A Motion Capture Air Guitar," School of Computer Science and Informatics, Cardiff University, September 2016.
[6]   A. R. Jensenius and K. A. V. Bjerkestrand, "Exploring micromovements with motion capture and sonification," 2011.
[7]   J. Norton, "Motion capture to build a foundation for a computer-controlled instrument by study of classical guitar performance," Stanford University, 2008.
[8]   K. Nymoen, S. A. Skogstad and A. R. Jensenius, "SoundSaber - A Motion Capture Instrument," in Proceedings of the International Conference on New Interfaces for Musical Expression, Oslo, Norway, 2011.
[9]   A. Perez-Carrillo, J. Arcos and M. Wanderley, "Estimation of Guitar Fingering and Plucking Controls Based on Multimodal Analysis of Motion, Audio and Musical Score," vol. 9617, 2016.
[10]   J. Romero, D. Tzionas and M. J. Black, "Embodied Hands: Modeling and Capturing Hands and Bodies Together," ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), vol. 36, no. 6, 2017.
[11]   M. Song, "Hidden melody in music playing motion: Music recording using optical motion tracking system," in Proceedings of the 22nd International Congress on Acoustics, Buenos Aires, 2016.
[12]   S. Suh, J. Lee and W. S. Yeo, "A Gesture Detection with Guitar Pickup and Earphone," in Proceedings of The International Conference on New Interfaces for Musical Expression, Goldsmiths, University of London, 2014.
[13]   A. Tkach, A. Tagliasacchi, E. Remelli, M. Pauly and A. Fitzgibbon, "Online Generative Model Personalization for Hand Tracking," ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), vol. 36, no. 6, 2017.

Used Technologies

Synertial Motion Capture Gloves

  • Stand-alone, bundled with a suit, or multiple pairs for use with optical rigs
  • Self-calibrating sensors to adapt to magnetic environments
  • User-configured kinematic structures
  • Instant calibration (configurable calibration poses)
  • Replace batteries and continue capturing where you left off without SW restart or recalibration in MoBu, Unity, etc.
  • Occlusion-free
  • Extremely accurate clock in the sync system for optical rig integration (Vicon, Motion Analysis, OptiTrack)
  • Real-time in MotionBuilder, Unity, Unreal, and Siemens Jack
  • Free user-friendly SDK and samples included with MoCap SW
  • Detachable electronics; change sizes in minutes
  • Short and long-fingered glove cloth types available
Source: https://www.synertial.com/mocapgloves

Our glove setup


More technologies to be added soon...



1. Research

  collecting resources, similar projects

2. Implementation

  1. Data collecting – real-time
     • setting up the equipment
     • capturing the motion
  2. Data analyzing and processing – offline
     • device wrapper (C++)
     • optimization, filtration, synchronization
  3. Data output
     • export into an output file – BioVision Hierarchical motion capture data (.bvh) or similar
     • optional: rendering in virtual reality (Unity) – animated figure
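The planned .bvh export can be sketched for a single joint chain. The joint name, offsets and channel order below are placeholders; a real exporter would walk the full captured skeleton:

```python
def write_bvh(path, frames, frame_time=1.0 / 120.0):
    """Write a minimal BVH file; frames is a list of (Z, X, Y) Euler rotations
    for the single root joint (a full root would also carry position channels)."""
    lines = [
        "HIERARCHY",
        "ROOT Hand",
        "{",
        "  OFFSET 0.0 0.0 0.0",
        "  CHANNELS 3 Zrotation Xrotation Yrotation",
        "  End Site",
        "  {",
        "    OFFSET 0.0 10.0 0.0",
        "  }",
        "}",
        "MOTION",
        f"Frames: {len(frames)}",
        f"Frame Time: {frame_time:.6f}",
    ]
    # one line of channel values per captured frame
    lines += [" ".join(f"{v:.4f}" for v in frame) for frame in frames]
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
```

The same HIERARCHY/MOTION split is what the loader on the Unity side would parse back when animating the figure.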

3. The initial chapter

  writing the first chapter of the bachelor thesis

4. Test and validation

  finalizing the application

5. Other chapters

  writing the rest of the thesis document


  • More coming soon...

  • Document - finishing implementation chapter

  • Recording session

    30/4 - 1/5/18
  • Adding Synertial data streamer to PhoXi camera API code

  • BVH loader - updating code to attach hand skeleton with forearm

  • New hand skeletons - with forearm bone

  • HTC Vive - setting up, troubleshooting, synchronizing with mocap glove - attaching tracker to a bone in skeleton file

  • Document - writing chapters on proposed solution and implementation

    March - April '18
  • Kinexact Hand - updating calibration tool to use new structures

    February '18
  • Finishing prototype

    Demo videos and source files

  • Programming BVH file loader, exporter + attaching hand mocap to body skeleton

  • Animating rendered model with recorded data

    Web preview of the initial model

  • Writing the initial chapter of the thesis

  • Making a presentation

    ppt pdf

  • Research of existing applications

October '17


Hand motion attached to the whole-body skeleton.

Motion applied to the character rendered in Unity (the video quality is somewhat poor due to rendering and capturing the video at the same time).

Complete .zip file contains:

  • initial chapter of the document
  • prototype source files
  • demo files
  • example character model (on which to apply motion demo files in Unity)
  • demo videos

  Download prototype ZIP file here


Dana Škorvánková

RNDr. Martin Madaras