Who attended the 2010 Autumn School?

Greg asked everyone - including the SoundSoftware.ac.uk organisers! - to come up with a short and snappy introduction to their research work: here are some of them.

Amy Beeston, University of Sheffield

Compensation for reverberation

Reverberation adversely affects artificial listening devices, and automatic speech recognition in particular suffers high error rates with even a minimal level of reflected sound energy. Our solution lies in the development of computational auditory models based on psychoacoustic principles of hearing. Like people, these models absorb information from contextual sound in order to improve the recognition of spoken words in reverberant rooms. Unlike existing methods of dereverberation, our approach allows us to consider the rapid changes in acoustical environments that are experienced in everyday, real-room listening situations.

Andrea De Marco, University of East Anglia

Intelligent speaker identification

We are studying the problem of reliable and robust speaker identification, which affects forensic scientists, security systems, and ambient intelligence systems. Current models remain well behind simple, untrained human cognitive processes.

Our solution looks at directly modelling the relevant human cognitive processes, using the best available machine learning techniques and cognitive perception theory. This is unlike state-of-the-art systems, which use simple statistical models valid for small populations or excessive test data [D.A. Reynolds 95/2000]. It will perform on par with or better than basic untrained human speaker identification.

Anne Wheatley, University of Southampton

Music perception in cochlear implant users

Cochlear implant users are not able to perceive music as well as they can understand speech. Music perception tests enable us to determine the ability of cochlear implant users to comprehend musical sounds, in order to improve cochlear implant technology.

This research affects cochlear implant users, candidates, clinicians and cochlear implant manufacturers.

We are currently reviewing the trends in music perception test use in UK cochlear implant centres. We are also reviewing currently available music perception test materials using listeners with normal hearing and cochlear implant users.

Becky Stewart, Queen Mary, University of London

Stop Looking, Start Listening

Incorporating interactive audio interfaces into music discovery.

Text searches give us words as results. Image searches give us pictures as results. Music searches do not give us songs as results. Instead, we usually get a list of song titles. It then takes several mouse clicks before we listen to any music. Interfaces which incorporate more listening can be a faster way to find music than a standard interface like iTunes. We create interfaces which let you listen immediately to help you find music more quickly.

Ben Fields, Goldsmiths, University of London

Contextualize Your Listening

Viewing Playlists as a Vehicle for Music Recommendation.

Time is what makes music different. To make music recommenders work better, I exploit time through the sequential ordering of recommended music and playlist generation. Further, my work entails a better understanding of the existing state of the art in playlist generation and its dependency on notions of music similarity. I've created datasets that contain both audio signal and social connections, and used them to create a multimodal, automatically generated similarity space. Using this similarity space, I've created a group radio web application that creates playlists based on periodic requests from current listeners. This system is live at http://radio.benfields.net . I've also been working on ways to describe and compare playlists, leading to distance spaces for playlists based on social tags.

Chris Hummersone, University of Surrey

Modelling the precedence effect

I build models of the precedence effect, including its dynamic components, in order to localise and separate mixtures of sounds.

The precedence effect has been observed in psychoacoustics for many years. Although many computational models of precedence exist, they do not yet include a component that accounts for dynamic processes; these processes appear to adjust precedence to the room in which the listener is located. Data from the models that I have built confirm the need for an equivalent computational mechanism, which I am currently working towards. This will drastically improve separation based on binaural cues. The technology could be used in many areas, including intelligent hearing aids and front-end processors for speech recognisers.

Daniel Wolff, City University London

Culture-Aware Music Recommendation

Integrating cultural bias into computational models for music similarity.

Modern music recommendation systems provide valuable tools for both advertisement and exploration of musical content. As a user, if you are not familiar with the music you are searching for, there's a good chance these systems won't match up to your expectations. Our music similarity model integrates cultural bias and therefore aims to adapt results to your particular cultural identity.

Eoin Mullan, Queen's University Belfast

Physical Modelling for Sound Synthesis in Computer Games

Real-time physical modelling techniques are providing more realistic and varied sound effects in computer games.

The problem of creating realistic sound effects in computer games is being largely ignored as developers continue to use decades-old sample playback techniques, while graphics, physics and artificial intelligence algorithms continue to improve. Our solution is to model the sound-producing vibrations of virtual objects based on their physical properties and on information, often from a physics engine, about the types of interaction that occur. Unlike the current practice of sample playback, which is labour-intensive to implement and can become repetitive and unrealistic for the user, our technique creates more realistic, varied sound effects and a more immersive gaming experience.

Erich Zwyssig, Edinburgh University

Speaker Diarisation in Meetings

Speaker Diarisation in Meetings ("who spoke when") is essential for the accurate speaker and speech recognition necessary for automatic meeting transcription, summarisation and generation of action/decision lists.

Detecting the presence of speech and isolating and merging individual speakers in recordings of meetings is still an open research issue. This research aims to improve the performance of voice activity detection (VAD) and speech segment clustering and merging, i.e. speaker diarisation. Typical problems in meeting recordings are, for example, noise, reverberation or overlapping speech. These degrade the performance of speech processing, and methods and algorithms to overcome them are needed.

Henry Lindsay Smith, Queen Mary, University of London

Automatically generating kits from drum loops is a problem for users of MPC-like audio plugins, who currently have to manually classify sliced loops. Our solution aims to automate all or much of this process, enabling 100% correct classification within a few clicks, in order to improve creative workflow, unlike other drum-machine-style plugins, which lack this feature.

Jens Enzo Nyby Christensen, Cambridge University

Cheap touch screens

A software-only implementation of touch screen functionality.

Decreasing profit margins and the increased demand for touch screen functionality in the telecommunications industry mean that the industry is constantly looking for cheaper ways to give users what they want. Acoustic Pulse Recognition enables the mobile phone manufacturer to supply touch screens at a fraction of the typical touch screen cost, with extra features and increased reliability.

Katrin Skoruppa, University College London

Kids Hear Language

Exploring the link between speech perception and language outcome in children with hearing impairment.

About 2 in 100 children suffer from hearing impairment. Conventional hearing aids and, more recently, cochlear implants allow the restoration of some of their auditory capacities, an important prerequisite for oral language acquisition. However, the speech signal they perceive remains impoverished and distorted. We study how children with hearing impairment learn language despite these input limitations. More specifically, we investigate whether they benefit from the extraordinary language learning mechanisms that children with normal hearing use to acquire their native language with surprising speed and ease.

Mathieu Barthet, Queen Mary, University of London

Musicology for everyone

New music technologies designed to fit user needs.

The recent increase in digitized music archives has prompted the development of new computational models to process music content. Music software based on machine learning techniques and non-stationary signal processing is able to perform complex analysis and visualisation tasks. However, little has been done to investigate whether such software is adapted to specific user needs, or how to improve its usability. Our research will focus on one category of users, musicologists, who are likely to use such technologies in the formal analysis of music. We will conduct an ethnographic study based on naturalistic observation and interviews to better understand the processes underlying musicological research and provide innovative solutions to enhance human/computer interaction in computational musicology.

Michael Gatt, De Montfort University

New Tools to Analyse Electroacoustic music!

We will develop an analysis toolbox to allow better understanding of compositions within the realm of electroacoustic music.

Within the field of electroacoustic music there exists no universal strategy for analysing it. This affects both the electroacoustic community and new listeners who want to gain a better understanding of the music.

Our solution is to create a toolbox of analytical tools that will help users create graphical scores for musical analysis. Current programs such as the Acousmographe only allow a user to create these scores and provide no help or guidance for the end user. Our program will have built-in aids, based on the toolbox, that will guide the end user by suggesting analytical methods that will be implemented within the program.

And our takes on the SoundSoftware.ac.uk project:

Chris Cannam, SoundSoftware.ac.uk

Software skills for audio research

The SoundSoftware project aims to help audio researchers by building their software skills.

Research students find it difficult to manage the software tools they need to produce and validate their work. We aim to teach the skills they need and to provide facilities they can use to make their lives easier and their research more sustainable.

Luis Figueira, SoundSoftware.ac.uk

Achieving Sustainability in Research

The problem of almost nonexistent good practice for software development deeply affects the Audio and Music Research community, who currently have difficulty reusing other researchers' software or even reproducing their own results.

Our project aims to offer researchers the tools they need, either by teaching good practice in software development or by giving access to specific tools, such as code repositories or documentation tools.

This project’s team has many years of experience in the area, having suffered from the same problems we’re addressing—and therefore we’re highly motivated to improve the current situation!