Music Information Retrieval Technology

Alexandra L. Uitdenbogerd
RMIT University, Melbourne, Victoria, Australia (July, 2002)


The field of Music Information Retrieval research is concerned with the problem of locating pieces of music by content, for example, finding the best matches in a collection of music to a particular melody fragment. This is useful for applications such as copyright-related searches.

In this work we investigate methods for the retrieval of polyphonic music stored as musical performance data in the MIDI standard file format. We devised a three-stage approach to melody matching consisting of melody extraction, melody standardisation, and similarity measurement. We analyse the nature of musical data, compare several novel melody extraction techniques, describe many melody standardisation techniques, and develop and compare various melody similarity measurement techniques. We also develop a method for evaluating the techniques in terms of the quality of answers retrieved, based on approaches developed within the Information Retrieval community.

We found that a technique judged to work well for extracting melodies is to select the highest-pitched note that starts at each instant.
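
As an illustration only, the highest-note rule can be sketched in a few lines of Python. The sketch assumes the notes of a piece have already been read from the MIDI file into (onset, pitch) pairs; the function name extract_top_line and this input format are hypothetical and are not taken from the work itself.

```python
def extract_top_line(notes):
    """Keep the highest pitch among the notes that start at each onset time.

    notes: iterable of (onset, pitch) pairs, where pitch is a MIDI note number.
    This input format is an assumption made for the sketch.
    Returns the retained pitches ordered by onset time.
    """
    highest = {}
    for onset, pitch in notes:
        # Remember only the highest pitch seen so far for this onset.
        if onset not in highest or pitch > highest[onset]:
            highest[onset] = pitch
    return [highest[onset] for onset in sorted(highest)]


# Two notes start at time 0 and two at time 2; the higher one is kept each time.
example_notes = [(0, 60), (0, 64), (1, 62), (2, 67), (2, 55)]
top_line = extract_top_line(example_notes)
```

Note durations and overlapping notes are ignored here; the sketch covers only the selection rule stated above.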

We have tested a variety of methods for locating similar pieces of music. The best techniques found thus far are local alignment of intervals and coordinate matching based on n-grams, with n from 5 to 7.
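
The two measures named above can be sketched as follows, under stated assumptions. Melodies are taken to be lists of MIDI pitch numbers and are reduced to sequences of intervals between successive notes. Coordinate matching is taken here to mean counting the distinct query n-grams that also occur in the piece, and local alignment is computed by standard dynamic programming. The function names and the scoring values (match 1, mismatch -1, gap -1) are assumptions made for illustration, not the settings used in the experiments.

```python
def intervals(pitches):
    """Convert a pitch sequence to the intervals between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def ngrams(seq, n):
    """Return the set of distinct n-grams (as tuples) occurring in seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def coordinate_match(query, piece, n=5):
    """Count the distinct n-grams of the query that also occur in the piece."""
    return len(ngrams(query, n) & ngrams(piece, n))


def local_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b.

    The scoring values are illustrative assumptions.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            score = max(
                0,                                            # start a new local alignment
                prev[j - 1] + (match if x == y else mismatch),  # align x with y
                prev[j] + gap,                                # skip x
                curr[j - 1] + gap,                            # skip y
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# A fragment of the piece transposed up five semitones has the same intervals.
piece = intervals([60, 62, 64, 65, 67, 69, 71, 72])
query = intervals([67, 69, 70, 72, 74, 76])
shared = coordinate_match(query, piece, n=5)
alignment_score = local_alignment(query, piece)
```

Because both measures operate on intervals rather than absolute pitches, a query in a different key from the stored piece can still match.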

In addition, we have compiled a collection of MIDI files, the representations of automatically extracted melodies of these files, a query set based on the extracted melodies, a collection of manual queries, and inferred and human relevance judgements. Experiments using these sets show that the type of query used in testing a system makes a significant difference to the outcome of an evaluation. Specifically, we found that manual queries performed best with a representation of the music collection that keeps a "melody" extracted from each instrumental part, whereas short automatically generated queries performed better when matched against a representation of the collection that, for each piece, uses the melody extraction approach described above.

