Report EDI-INF-RR-0661

Informatics Report Series

Title: Speech and crosstalk detection in multi-channel audio

Authors: S. J. Wrigley; G. J. Brown; V. Wan; Steve Renals

Date: Jan 2005

Publication Title: IEEE Transactions on Speech and Audio Processing

Publisher: IEEE Signal Processing Society

Publication Type: Journal Article

Publication Status: Published

Volume No: 13(1)

Page Nos: 84-91

DOI: 10.1109/TSA.2004.838531

Abstract: The analysis of scenarios in which a number of microphones record the activity of speakers, such as in a roundtable meeting, presents a number of computational challenges. For example, if each participant wears a microphone, it can receive speech from both the microphone's wearer (local speech) and from other participants (crosstalk). The recorded audio can be broadly classified in four ways: local speech, crosstalk plus local speech, crosstalk alone and silence. We describe two experiments related to the automatic classification of audio into these four classes. The first experiment attempted to optimise a set of acoustic features for use with a Gaussian mixture model (GMM) classifier. A large set of potential acoustic features were considered, some of which have been employed in previous studies. The best-performing features were found to be kurtosis, fundamentalness and cross-correlation metrics. The second experiment used these features to train an ergodic hidden Markov model classifier. Tests performed on a large corpus of recorded meetings show classification accuracies of up to 96%, and automatic speech recognition performance close to that obtained using ground truth segmentation.
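
As a rough illustration of two of the feature types named in the abstract, the sketch below computes a per-frame kurtosis and a peak normalised cross-correlation between two channels. The abstract does not give the paper's exact feature definitions, so the function names (`kurtosis`, `peak_cross_correlation`), the normalisation, and the synthetic test signals are all assumptions made for this example; fundamentalness and the GMM and HMM classifiers are not shown.

```python
import numpy as np


def kurtosis(frame):
    """Fourth standardised moment of a frame (about 3 for Gaussian samples).

    Hypothetical helper: the paper's exact kurtosis feature may differ.
    """
    x = np.asarray(frame, dtype=float)
    centred = x - x.mean()
    var = np.mean(centred ** 2)
    if var == 0.0:
        return 0.0  # constant or silent frame: defined as 0 in this sketch
    return float(np.mean(centred ** 4) / var ** 2)


def peak_cross_correlation(a, b):
    """Largest absolute cross-correlation over all lags, scaled to [0, 1].

    Hypothetical metric: the value is normalised by the energies of both
    frames, so a delayed copy of the same signal scores close to 1.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if norm == 0.0:
        return 0.0
    corr = np.correlate(a, b, mode="full")
    return float(np.max(np.abs(corr)) / norm)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local = rng.laplace(size=400)        # stand-in for one channel's frame
    leaked = 0.3 * np.roll(local, 5)     # attenuated, delayed copy on another channel
    unrelated = rng.normal(size=400)     # independent noise for comparison
    print("kurtosis:", kurtosis(local))
    print("xcorr with leaked copy:", peak_cross_correlation(local, leaked))
    print("xcorr with unrelated noise:", peak_cross_correlation(local, unrelated))
```

In a setting like the one the abstract describes, values of this kind would be computed per frame and per channel and passed to a classifier; here they are only printed.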

BibTeX format
@Article{EDI-INF-RR-0661,
  author = {S. J. Wrigley and G J Brown and V Wan and Steve Renals},
  title = {Speech and crosstalk detection in multi-channel audio},
  journal = {IEEE Transactions on Speech and Audio Processing},
  publisher = {IEEE Signal Processing Society},
  year = 2005,
  month = {Jan},
  volume = {13},
  number = {1},
  pages = {84-91},
  doi = {10.1109/TSA.2004.838531},
  url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=29967&arnumber=1369314&count=12&index=7},
}

Please mail <reports@inf.ed.ac.uk> with any changes or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh.