Calendar

Automatic speaker verification from affective speech using Gaussian mixture model based estimation of neutral speech characteristics

Abstract: Intra-speaker variability, caused by emotional speech, is a real threat to the performance of speaker recognition systems. In fact, as human beings, we are constantly changing our emotional state. While many efforts have been made to increase automatic speaker verification (ASV) robustness towards channel effects or spoofing attacks, only a handful of studies have addressed the detrimental consequences of affective speech. In this work, we propose a new method to minimize the mismatch between neutral and affective speech. To this end, a Gaussian mixture model is used to learn a prior probability distribution of the neutral speech for a given speaker (i.e., characterizing his/her source space). This knowledge is then used to minimize the differences between target (affective) and source (neutral) spaces. The proposed method is validated across four multilingual emotional datasets. Experimental results show a consistent improvement in performance across eight emotional states, with significant reductions of equal error rate relative to the baseline.

Bio: Prof. Anderson Ávila is an Assistant Professor at INRS-EMT, working in the INRS-UQO Mixed Research Unit in Cybersecurity. Prior to joining INRS-UQO, Dr. Ávila was a research scientist in natural language and speech processing, working on projects related to model compression, low latency, and robustness in spoken language understanding. His main research interests are data privacy via federated learning, combating misinformation using AI, and robustness of biometrics.

Date

Thursday, April 13, 2023
From 11:00 to 12:00

Place

Polytechnique Montréal - Lassonde Buildings
2700, chemin de la Tour
Montreal, Québec
Canada
M-2103