By Manuela Hürlimann, ZHAW and Thomas Zaugg, Roche
On May 10th, 2023, the “Natural Language Processing in Action” Expert Group of the data innovation alliance and SwissNLP organised a meeting in Zurich with three exciting presentations on speech processing.
Oscar Koller, a principal applied scientist at Microsoft, presented on the use of end-to-end neural systems for automatic speech recognition (ASR) in Swiss German. He discussed how the current industry paradigm of hybrid ASR is being replaced by end-to-end models, which have been winning recent benchmarks. Oscar shared the results of his team’s comparison of different neural network architectures and highlighted the advantages of transducers for real-time performance in their work.
Claudio Paonessa, a researcher at FHNW, discussed how recent advances in speech-to-text, text-to-speech, and translation for Swiss German can be combined with a large language model to create a voice-based conversational assistant. He shared a demo of the assistant in action, showcasing its ability to give apt replies. However, he also acknowledged that processing time still needs to be reduced for the interaction to feel real-time, and suggested reducing model size as one possible solution.
Dr. Edith Birrer, a senior researcher at iHomeLab, HSLU, presented results from her team’s work on using speech processing in the context of home care. Together with international project partners, they conducted interviews and workshops to identify potential use cases for home care workers. While they had originally planned to focus on care documentation, their results showed that most care workers found supporting services – such as a to-do list that can be ticked off verbally – to be more useful. They implemented three use cases and tested them in a lab with carers. The tests showed a high level of enthusiasm among users, but also underlined the need to address data privacy concerns before such technologies can become widely used.
After the presentations, attendees enjoyed an apéro and continued discussing the topics at hand.