Anchors and hubs in audio-based music similarity

Adam Berenzweig
Columbia University, NY, USA (2007)


Content-based music discovery, retrieval, and management tools become increasingly important as personal music collections grow beyond several thousand songs. They can also play a crucial role in music marketing as music discovery moves online, which benefits both consumers and artists.

This dissertation describes our work on computing music similarity measures from audio. The basic approach is to compute short-time spectral features, model the distributions of these features, and then compare the models. Several choices of features, models, and comparison techniques are examined, including a method of mapping audio features into a semantic space called anchor space before modeling. A practical problem with this technique, known as the hub phenomenon, is explored, and we conclude that it is related to the curse of dimensionality.
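The pipeline described above (frame-level features, a model of their distribution per song, then a comparison between models) can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a single full-covariance Gaussian per song and a symmetrised Kullback-Leibler divergence as the comparison, which is only one possible choice of model and measure. Random vectors stand in for real short-time spectral features, and the function names `fit_gaussian`, `symmetric_kl`, and `k_occurrence` are hypothetical.

```python
import numpy as np


def fit_gaussian(frames):
    """Model one song's frame features (n_frames x dim) as a single Gaussian."""
    mean = frames.mean(axis=0)
    # A small ridge keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(frames.shape[1])
    return mean, cov


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p) for two Gaussians given as (mean, cov) pairs."""
    (m1, c1), (m2, c2) = p, q
    ic1, ic2 = np.linalg.inv(c1), np.linalg.inv(c2)
    d = m1 - m2
    dim = m1.shape[0]
    # The log-determinant terms of the two directions cancel in the sum.
    return 0.5 * (np.trace(ic2 @ c1) + np.trace(ic1 @ c2) + d @ (ic1 + ic2) @ d) - dim


def k_occurrence(dist, k):
    """Count how often each song appears among the k nearest neighbours of the others."""
    d = dist.astype(float)  # astype returns a copy, so the input is left untouched
    np.fill_diagonal(d, np.inf)  # a song is not its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.bincount(nearest.ravel(), minlength=d.shape[0])


# Synthetic stand-in data: 20 "songs", each with 200 frames of 8-dimensional features.
rng = np.random.default_rng(0)
songs = [rng.normal(loc=rng.normal(size=8), size=(200, 8)) for _ in range(20)]
models = [fit_gaussian(s) for s in songs]
dist = np.array([[symmetric_kl(a, b) for b in models] for a in models])
counts = k_occurrence(dist, k=5)
```

In this sketch, a song whose entry in `counts` is far above `k` appears in many other songs' neighbour lists, which is one simple way to look for hub-like behaviour in a similarity matrix.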

Music similarity is inherently subjective, context-dependent, and multi-dimensional, so there is no single ground truth for training and evaluation. This work therefore also explores different sources of subjective human opinion and objective ground truth, and develops evaluation metrics that use them.
