Multi-channel source separation with video data

Mosayyebpour, Sahand

Files

University_of_Victoria__UVic__LaTeX_thesis_template___2020 (4).pdf (2.7 MB)

Date

2024

Authors

Mosayyebpour, Sahand

Abstract

This research introduces a supervised multi-channel audio source separation system that integrates a video-based face detection system. The face detector identifies the nose position, which helps the multi-channel processing isolate the primary speaker while suppressing environmental background noise and distracting secondary speakers. It is demonstrated that, in far-field applications, multi-channel processing struggles with distracting secondary speakers when the primary speaker's position is unknown. Video data helps identify the target speaker and lets the audio source separation system direct its focus toward that speaker. Furthermore, it is shown that multi-channel processing benefits from speaker position information to improve noise reduction in noisy, reverberant environments.

URI

https://hdl.handle.net/1828/20872

Collections

Electronic Theses and Dissertations (ETD)
Theses (Electrical and Computer Engineering)
