Including collaborations and alumni. All my SNSF funded projects are on the SNSF Data Portal pages.

Current projects


Storytelling and first impressions in face-to-face and algorithm-powered digital interviews is a collaboration with Adrian Bangerter at UniNe and Marianne Schmid Mast at UNIL. We will try to automatically extract the information that the psychologists find to be indicative of job performance from interviews. The project funds Mutian He and the best parts of Daniel Carron.

NCCR Evolving Language

I work with James Henderson and Hervé Bourlard in “Low Level Mechanisms of Language Evolution”. We hope to model the junction of the human cochlea and the neural system and uncover the means by which the human auditory system can adapt to changing environments. It funds Louise.


The Nature of Artificial Intelligence is an Agora project; with several Idiap colleagues, we will try to demonstrate speech synthesis in a museum context. The exhibition is now open at the Musée de la Main.


Neural Architectures for Speech Technology is a follow-up to MASS in which we will generalise that work to generic neural architectures. We aim to embed physiologically and statistically plausible components into standard deep learning architectures to bring them closer to true analogues of biological systems. It funds Alex and Haolin.

Previous projects


ADeL can be thought of as building on the work done in DeepCharisma, bringing in audio-visual modes to investigate how speech and gesture processing can help infer charismatic qualities. The work was funded by E4S; it was a collaboration with Daniel Gatica here at Idiap, with Marianne Schmid Mast and John Antonakis at UNIL and Jennifer Jordan and Alyson Meister at IMD.


DAHL was a collaboration with Swisscom in which Abbas Khosravani and I evaluated the state of the art in ASR for Swisscom’s particular requirements. We were particularly interested in word-fragment-based methods and how robust they can be to the “orthographic ambiguity” of Swiss German.


With Petr Motlicek and Logitech, we aimed to bring speech to consumer-grade embedded systems. The work was with Niccolò and Vincent.


Multilingual Affective Speech Synthesis continued the work begun in SIWIS and SP2 by attempting to model the prosody of emotion. It funded Bastian.


Working with Olivier Bornet, John Antonakis and Dominic Rohner, we tried to infer charisma using deep learning.


SUMMA was an H2020 funded project on big data and multilingual speech processing.


With Alexandros Lazaridis, before he moved to Swisscom.


The SCOPES project on Speech Prosody was a collaboration with BME-TMIT in Budapest, FEEIT in Skopje and the RTD group at UNS in Novi Sad.


Spoken Interaction with Interpretation in Switzerland was a Sinergia project funded by SNSF. At Idiap it funded Alexandros Lazaridis and Pierre-Edouard Honnet.


This work was with Petr Motlicek and Milos Cernak for the defence procurement arm of the Swiss government. We aimed to use the synthesis technology from EMIME to do low bit rate coding.

Samsung GRO

This was work with Petr Motlicek and Samsung’s SAIT lab under the GRO program on advanced multilingual acoustic modelling for speech recognition. The project continued, funded directly by SAIT.


This was a project aiming to do speech recognition and indexing of the Valaisan parliament. It was run by Alexandre Nanchen; I helped where possible along with György Szaszak.


Another Hasler Foundation project related to CLAS3, but focussed on fast adaptation using vocal tract normalisation methods developed by Lakshmi Saheer.


This was a Hasler Foundation funded project allowing Hui Liang to finish his PhD work. The cross lingual adaptation techniques that Hui developed are the basis for SIWIS.


I worked on EMIME with John Dines, Lakshmi Saheer and Hui Liang. That project continued in a Swiss sense as SIWIS.


I worked on AMIDA until it finished at the end of 2009. That work continued under IM2. It was closely related to the TA2 project.