NMF vs Neural Nets

Neural Network Alternatives to Convolutive Audio Models for Source Separation

*** Best Student Paper Award ***

The convolutive Non-Negative Matrix Factorization (NMF) model factorizes an audio spectrogram using frequency templates that have a temporal dimension.
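As a rough illustration of the convolutive model, the sketch below rebuilds a spectrogram as a sum of template slices multiplied by time-shifted activations. The array layout, the function names `shift_right` and `conv_nmf_reconstruct`, and the random data are assumptions made for illustration; they are not taken from the paper's code.

```python
import numpy as np

def shift_right(H, t):
    """Shift the columns of H right by t frames, zero-filling on the left."""
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out

def conv_nmf_reconstruct(W, H):
    """Approximate a spectrogram from temporal templates and activations.

    W: (T, F, K) array, K templates that each span T frames and F frequencies.
    H: (K, N) array of non-negative activations over N frames.
    Returns an (F, N) array: sum over t of W[t] @ shift_right(H, t).
    """
    T = W.shape[0]
    return sum(W[t] @ shift_right(H, t) for t in range(T))

# Hypothetical sizes: 4-frame templates, 6 frequency bins, 3 templates, 10 frames.
rng = np.random.default_rng(0)
W = rng.random((4, 6, 3))
H = rng.random((3, 10))
V_hat = conv_nmf_reconstruct(W, H)
```

With `T = 1` the sum collapses to the ordinary NMF product `W[0] @ H`, which is one way to see the convolutive model as a generalization of plain NMF.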

In this work, we develop a convolutional auto-encoder that acts as a neural network alternative to convolutive NMF.

Taking advantage of the modeling flexibility granted by neural networks, we also explore the use of a Recurrent Neural Network in the encoder.

Experimental results on speech mixtures from the TIMIT dataset indicate that the convolutive architecture provides a significant improvement in separation performance in terms of BSS_eval metrics.
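BSS_eval reports several metrics, one of which is the signal-to-distortion ratio (SDR). The sketch below computes only a simplified SDR between a reference signal and its estimate. It omits BSS_eval's decomposition of the error into interference and artifact terms, so it is an approximation for intuition, not a substitute for the toolkit. The function name `simple_sdr` and the toy signals are assumptions.

```python
import numpy as np

def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy.

    Assumption: this is NOT the full BSS_eval SDR, which first projects the
    estimate onto the reference sources before measuring distortion.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / (np.sum(error ** 2) + eps))

# Toy example: the estimate is the reference scaled by 0.9.
reference = np.array([1.0, 0.0, -1.0, 0.0])
estimate = 0.9 * reference
sdr_db = simple_sdr(reference, estimate)
```

Higher values indicate that the estimate is closer to the reference.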

[Paper] [Code] [Presentation]

A Neural Network Alternative to Non-negative Audio Models

We present a neural network that can act as an equivalent to Non-Negative Matrix Factorization (NMF).
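One way to picture such a network is as a two-layer autoencoder whose outputs are kept non-negative by the activation function. The sketch below shows only a forward pass. The choice of softplus, the unconstrained weight matrices, and the names `nonneg_autoencoder`, `W_enc`, and `W_dec` are assumptions for illustration and may differ from the architecture in the paper.

```python
import numpy as np

def softplus(x):
    """Smooth non-negative activation, computed stably as log(1 + exp(x))."""
    return np.logaddexp(0.0, x)

def nonneg_autoencoder(X, W_enc, W_dec):
    """Forward pass of a two-layer non-negative autoencoder.

    X:     (F, N) magnitude spectrogram.
    W_enc: (K, F) encoder weights (assumed unconstrained in sign).
    W_dec: (F, K) decoder weights, playing the role of NMF's basis matrix.
    Returns H of shape (K, N), analogous to NMF activations, and the
    reconstruction X_hat of shape (F, N).
    """
    H = softplus(W_enc @ X)
    X_hat = softplus(W_dec @ H)
    return H, X_hat

# Hypothetical sizes: 6 frequency bins, 10 frames, 3 latent components.
rng = np.random.default_rng(0)
X = rng.random((6, 10))
W_enc = rng.standard_normal((3, 6))
W_dec = rng.standard_normal((6, 3))
H, X_hat = nonneg_autoencoder(X, W_enc, W_dec)
```

In this sketch the activation, not a constraint on the weights, keeps both `H` and `X_hat` non-negative. In practice the weights would be trained to minimize a reconstruction error between `X` and `X_hat`.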

Next, we show how it can be used to perform supervised source separation.
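A common final step in supervised separation is to turn per-source magnitude estimates into soft masks and apply them to the mixture. The sketch below assumes that each source model has already produced a non-negative estimate of its contribution. The masking scheme and the name `soft_mask_separate` are illustrative assumptions, not necessarily the procedure used in the paper.

```python
import numpy as np

def soft_mask_separate(mix_mag, source_estimates, eps=1e-8):
    """Split a mixture magnitude spectrogram using ratio masks.

    mix_mag:          (F, N) magnitude spectrogram of the mixture.
    source_estimates: list of (F, N) non-negative per-source estimates
                      (assumed to come from models trained on each source).
    Returns a list of (F, N) separated magnitudes, one per source.
    """
    total = sum(source_estimates) + eps  # eps avoids division by zero
    return [mix_mag * (est / total) for est in source_estimates]

# Toy data standing in for model outputs.
rng = np.random.default_rng(0)
mix_mag = rng.random((5, 8))
estimates = [rng.random((5, 8)) + 0.1, rng.random((5, 8)) + 0.1]
separated = soft_mask_separate(mix_mag, estimates)
```

Because the masks sum to approximately one in every time-frequency bin, the separated magnitudes add back up to the mixture.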

We show how the extensibility of this approach allows us to achieve better source separation performance than NMF-based methods.

Finally, we also propose a variety of derivative architectures that can be used for further improvements.

[Paper] [Code] [Presentation]