Real-time gesture-based sound control system

Date

2024

Authors

Khazaei, Mahya

Abstract

This thesis presents a real-time, human-in-the-loop music control and manipulation system that adapts audio output based on analysis of human movement captured from live-stream video. The system creates a responsive link between visual and auditory stimuli: dancers not only respond to music but also influence it through their movements, supporting live performances, interactive installations, and personal entertainment with an immersive experience in which users' movements directly shape the music. The project demonstrates how machine learning and signal processing techniques can produce responsive audio-visual systems that evolve with each movement, bridging human interaction and machine response in a closed loop. Computer vision and machine learning tools track and interpret the motion of people dancing or moving, enabling them to actively shape audio parameters such as tempo, pitch, effects, and playback sequence in real time. The system improves through ongoing training: by providing varied samples, users can generalize models for user-independent use, and around 50–80 samples are typically sufficient to label a simple gesture. Through an integrated pipeline of gesture training, cue mapping, and audio manipulation, this human-centered system continuously adapts to user input: gestures are trained as signals from the user to the model, mapped to sound-control commands, and used to manipulate audio elements naturally.
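To illustrate the pipeline described above (gesture training, cue mapping, audio manipulation), the following is a minimal sketch rather than the thesis implementation. It assumes pose-landmark feature vectors from some hypothetical extractor, a simple nearest-neighbour classifier trained on a few dozen labelled samples per gesture, and a placeholder gesture-to-cue table; the actual system's models, features, and audio backend may differ.

```python
# Minimal sketch of a gesture -> sound-control pipeline (illustrative only).
# Feature vectors stand in for flattened pose landmarks from a video stream;
# the classifier and the GESTURE_TO_CUE table are hypothetical placeholders.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical mapping from a recognised gesture to an audio control command.
GESTURE_TO_CUE = {
    "raise_arms": ("tempo", +10),      # speed up playback
    "lower_arms": ("tempo", -10),      # slow down playback
    "spin":       ("effect", "reverb"),
}

def train_gesture_model(samples: np.ndarray, labels: list[str]) -> KNeighborsClassifier:
    """Fit a simple classifier on labelled pose-feature vectors (~50-80 per gesture)."""
    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(samples, labels)
    return model

def frame_to_cue(model: KNeighborsClassifier, features: np.ndarray):
    """Classify one frame's features and look up the mapped sound-control cue."""
    gesture = model.predict(features.reshape(1, -1))[0]
    return GESTURE_TO_CUE.get(gesture)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for labelled training data (e.g. 66-dim pose vectors).
    X = np.vstack([rng.normal(0.0, 1.0, (80, 66)), rng.normal(3.0, 1.0, (80, 66))])
    y = ["raise_arms"] * 80 + ["lower_arms"] * 80
    model = train_gesture_model(X, y)
    print(frame_to_cue(model, X[0]))   # e.g. ('tempo', 10)
```

In a live setting, `frame_to_cue` would run per video frame and the returned command would drive the audio engine (tempo, pitch, effects, or playback sequence), closing the loop between movement and sound.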
