In this paper we use content-based features to automatically classify music pieces into genres. We group these features into four categories: features extracted from the magnitude spectrum of the Fourier transform, features designed to capture tempo, pitch-related features, and chordal features. We perform a novel and thorough exploration of classification performance across different feature representations: the mean and standard deviation of a feature's distribution, histograms of various bin sizes, and mel-frequency cepstral coefficients. Finally, we use information gain ranking to derive a pruned feature vector, which is then used by six off-the-shelf classifiers. Logistic regression achieves the best performance, with 81% accuracy on the 10 GTZAN genres.
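
To make the information gain ranking step concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes each feature is discretised into equal-width bins before its information gain with respect to the genre label is computed; the binning scheme, the function names (`entropy`, `information_gain`, `rank_features`), and the `n_bins` parameter are all illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, n_bins=4):
    """Information gain of one numeric feature about the class labels.

    Assumption: the feature is discretised into n_bins equal-width bins.
    """
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when the feature is constant.
    width = (hi - lo) / n_bins or 1.0
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]

    n = len(labels)
    conditional = 0.0
    for b in set(bins):
        subset = [y for x, y in zip(bins, labels) if x == b]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def rank_features(columns, labels, n_bins=4):
    """Return (feature name, information gain) pairs, highest gain first."""
    scores = {
        name: information_gain(vals, labels, n_bins)
        for name, vals in columns.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, a pruned feature vector would be obtained by keeping only the top-ranked entries returned by `rank_features`; where to place the cut-off is a separate choice that this sketch does not make.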

