For those who are looking for something to read, here's my Master's thesis.
FEATURE SELECTION AND EVALUATION FOR GENRE CLASSIFICATION OF
SYMBOLICALLY ENCODED CLASSICAL MUSIC WITH THE AID OF MACHINE LEARNING
Gustavo Cesar de Souza Frederico
Thesis submitted to the Faculty of Graduate and Postdoctoral Studies
in partial fulfillment of the requirements for the degree of Master of Computer Science
June 1st, 2006
Ottawa-Carleton Institute for Computer Science
School of Information Technology and Engineering
University of Ottawa
This work defines useful features for the classification of symbolically encoded music into 14 classical genres, namely chorale, symphony, étude, fugue, prelude, contrafactum, sonata, mazurka, motet, sonatina, waltz, concerto, Gregorian chant and scherzo. Features are based on Music Theory and grouped into seven categories: distances in the harmonic Möbius strip, distances on the line of fifths, scale, rhythmic syncopation and meter, polyphony measurements, duration and instrumentation. Features are extracted and ranked by combining five filter-based methods. Six Machine Learning algorithms are defined for classification: three Support Vector Machines, one Bayesian network, C4.5 and random forests. Using nested cross-validation for training and testing and considering all the features, the Bayesian network classifier yields 84.10% empirical accuracy. The FEATUROMETRE process measures the usefulness of the feature subsets in an approach similar to wrapper methods, conveying relevant information to domain experts. Another experiment measures the usefulness and accuracy of features individually and by category using FEATUROMETRE. When the music pieces are grouped by their period, the measured accuracy with the random forest classifier in the second experiment reaches 89.81%.
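To give a rough feel for what "ranked by combining filter-based methods" can look like, here is a minimal Python sketch. It is only an illustration, not the procedure from the thesis: the abstract does not say how the five rankings are combined, so averaging ranks is an assumption, and the function names, the three hypothetical filters, and all scores are made up.

```python
from statistics import mean


def scores_to_ranks(scores):
    """Turn {feature: score} into {feature: rank}; rank 1 is the highest score.

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: pos for pos, feature in enumerate(ordered, start=1)}


def combine_rankings(score_tables):
    """Order features by their mean rank across several filter methods.

    ASSUMPTION: mean-rank aggregation is used purely for illustration;
    the thesis may combine its filter methods differently.
    """
    rank_tables = [scores_to_ranks(table) for table in score_tables]
    features = list(rank_tables[0])
    mean_rank = {f: mean(ranks[f] for ranks in rank_tables) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Made-up scores from three hypothetical filter methods (higher is better).
filter_a = {"scale": 0.9, "duration": 0.4, "polyphony": 0.6}
filter_b = {"scale": 0.7, "duration": 0.2, "polyphony": 0.8}
filter_c = {"scale": 0.5, "duration": 0.3, "polyphony": 0.1}

combined = combine_rankings([filter_a, filter_b, filter_c])
print(combined)
```

Working on ranks rather than raw scores keeps filters with different score scales comparable, which is one common reason to aggregate this way.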