» By Joren on Monday 12 December 2011
On Sunday 18 December 2011 I gave a workshop for the Ghent children's university (Gentse kinderuniversiteit). The theme of the children's university was Music under the microscope (Muziek onder de microscoop). The teaser for the workshop can be found here:
WORKSHOP – Muziek (ont)luisteren op de computer (Listening to, and demystifying, music on the computer)
Is it possible to play the piano on a table? Can a computer listen to music and enjoy it? What is music, really, and how does sound work?
During this workshop these questions are answered with the help of a few computer programs!
Concretely, a few components of sound (and, by extension, music) are demonstrated with small computer programs made at the conservatory:
- Loudness: a decibel meter with a set threshold. Try to be as loud as possible and see how hard it is to gain more decibels once a certain level has been reached.
- Pitch: a small game to demonstrate pitch. Try to sing or whistle as accurately as possible and compare your score.
- Percussion: this program reacts to hand claps. How can you tell the difference between, for example, a whistled tone and a hand clap?
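The loudness demo rests on the logarithmic decibel scale. The following is a minimal sketch of that idea, not the code of the workshop programs: it assumes samples scaled to [-1, 1] and reports the level relative to full scale. The class and method names (DecibelMeter, rms, levelInDb, sine) are made up for this illustration.

```java
public class DecibelMeter {

    /** Root-mean-square amplitude of samples scaled to the range [-1, 1]. */
    public static double rms(double[] samples) {
        double sumOfSquares = 0.0;
        for (double s : samples) {
            sumOfSquares += s * s;
        }
        return Math.sqrt(sumOfSquares / samples.length);
    }

    /**
     * Level in dB relative to full scale: an RMS of 1.0 gives 0 dB,
     * digital silence gives negative infinity.
     */
    public static double levelInDb(double[] samples) {
        return 20.0 * Math.log10(rms(samples));
    }

    /** Generates a sine wave with the given peak amplitude (test signal). */
    public static double[] sine(double amplitude, double frequencyHz,
                                int sampleRate, int length) {
        double[] out = new double[length];
        for (int i = 0; i < length; i++) {
            out[i] = amplitude * Math.sin(2 * Math.PI * frequencyHz * i / sampleRate);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] quiet = sine(0.25, 440.0, 44100, 44100);
        double[] loud = sine(0.5, 440.0, 44100, 44100);
        // Doubling the amplitude raises the level by 20 * log10(2), about 6 dB.
        System.out.printf("quiet: %.2f dB%n", levelInDb(quiet));
        System.out.printf("loud:  %.2f dB%n", levelInDb(loud));
        System.out.printf("gain:  %.2f dB%n", levelInDb(loud) - levelInDb(quiet));
    }
}
```

Because every doubling of the amplitude adds only about 6 dB, each further step up the scale demands twice the sound pressure, which is why it gets hard to climb once you are already loud.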
The photos below give an impression of the atmosphere.
Presentation, Dutch, Java, and TarsosDSP
logo-with-arrow.png, UtterAsterisk.jar, PercussionDetector.jar, PitchDetector.jar, and SoundDetector.jar
» By Joren on Wednesday 09 November 2011
The aim of acoustic fingerprinting is to generate a small representation of an audio signal that can be used to identify or recognize similar audio samples in a large audio set. A robust fingerprinting scheme generates similar fingerprints for perceptually similar audio signals: a piece of music with a bit of noise added should yield a fingerprint almost identical to that of the original. The use cases for audio fingerprinting or acoustic fingerprinting are myriad: detection of duplicates, identifying songs, recognizing copyrighted material,…
Using a pitch class histogram as a fingerprint seems like a good idea: it is unique for a song and it is reasonably robust to changes of the underlying audio (length, tempo, pitch, noise). The idea has probably been found a couple of times independently, but there is also a reference to it in the literature, by Tzanetakis, 2003: Pitch Histograms in Audio and Symbolic Music Information Retrieval:
Although mainly designed for genre classification it is possible that features derived from Pitch Histograms might also be applicable to the problem of content-based audio identification or audio fingerprinting (for an example of such a system see (Allamanche et al., 2001)). We are planning to explore this possibility in the future.
Unfortunately, as far as I know, they never explored this possibility, and I do not know whether anybody else did. I found it worthwhile to implement a fingerprinting scheme on top of the Tarsos software foundation. Most elements are already available in the Tarsos API: a way to detect pitch, construct a pitch class histogram, correlate pitch class histograms with a pitch shift,… I created a GUI application which is presented here. It is probably the first open source acoustic / audio fingerprinting system based on pitch class histograms.

It works using drag and drop, and the idea is to find a needle (an audio file) in a haystack (a large number of audio files). For the needle and for every audio file in the haystack, pitch is detected using a speed-optimized Yin implementation. A pitch class histogram is created for each file, the histogram of the needle is compared with each histogram in the haystack and, hopefully, the needle is found in the haystack.
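The matching step described above can be sketched as follows. This is an illustration under stated assumptions, not the Tarsos implementation: pitch estimates are taken as given (no Yin detection here), the histogram uses 120 bins of 10 cents, 440 Hz serves as an arbitrary folding reference, and histogram intersection is used as the similarity measure. All names (PitchClassFingerprint, histogram, similarity, bestMatch) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PitchClassFingerprint {

    static final int BINS = 120;               // 10 cents per bin (assumption)
    static final double REFERENCE_HZ = 440.0;  // arbitrary folding reference (assumption)

    /** Folds pitch estimates (Hz) into one octave and returns a normalized histogram. */
    public static double[] histogram(double[] pitchesInHz) {
        double[] bins = new double[BINS];
        int counted = 0;
        for (double hz : pitchesInHz) {
            if (hz <= 0) continue; // skip frames without a pitch estimate
            double cents = 1200.0 * Math.log(hz / REFERENCE_HZ) / Math.log(2.0);
            double pitchClass = ((cents % 1200.0) + 1200.0) % 1200.0; // in [0, 1200)
            int bin = (int) (pitchClass * BINS / 1200.0) % BINS;
            bins[bin]++;
            counted++;
        }
        for (int i = 0; i < BINS && counted > 0; i++) {
            bins[i] /= counted; // normalize so that file length does not matter
        }
        return bins;
    }

    /** Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint ones. */
    public static double similarity(double[] a, double[] b) {
        double overlap = 0.0;
        for (int i = 0; i < a.length; i++) {
            overlap += Math.min(a[i], b[i]);
        }
        return overlap;
    }

    /** Returns the name of the haystack entry most similar to the needle, or null if empty. */
    public static String bestMatch(double[] needle, Map<String, double[]> haystack) {
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, double[]> entry : haystack.entrySet()) {
            double score = similarity(needle, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up pitch tracks standing in for the output of a pitch detector.
        Map<String, double[]> haystack = new LinkedHashMap<>();
        haystack.put("songA", histogram(new double[] {440.0, 440.0, 660.0, 550.0}));
        haystack.put("songB", histogram(new double[] {466.16, 523.25, 587.33}));
        // The needle holds the pitches of songA in other octaves and another order.
        double[] needle = histogram(new double[] {880.0, 220.0, 330.0, 1100.0});
        System.out.println("best match: " + bestMatch(needle, haystack));
    }
}
```

Normalizing each histogram makes files of different length comparable, and folding to one octave discards octave information, which is what makes the histogram a pitch class histogram.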
Unfortunately I do not have time for rigorous testing (by building a large acoustic fingerprinting data set, or another decent test bench), but the idea seems to work. With the following modifications, made with Audacity effects, the needle was still found in a haystack of 836 files:
- A 10% speedup
- 15 and 30 seconds removed from the needle (a song of 4 minutes 12 seconds)
- White noise added
- Reversed the audio (This is, I believe, a rather unique property of this fingerprinting technique)
- GSM re-encoded
With the following modifications the correct song was not identified:
- A one semitone pitch shift
- A two semitone pitch shift
- 60 seconds removed from the needle
The original was also found. No failure analysis was done. The haystack consists of about 100 hours of Western pop; the needle is also a Western pop song. If somebody wants to pick up this work or has an acoustic fingerprinting data set, please drop me a line.
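The reversal result in the list above is what one would expect from a histogram: it counts how often each pitch class occurs and ignores the order in which the pitches appear. The sketch below illustrates this with a made-up pitch track. It assumes 12 pitch classes of 100 cents around a 440 Hz reference and assumes that a pitch detector returns the same estimates for reversed audio, which real detectors only approximate. The names are hypothetical and not part of Tarsos.

```java
import java.util.Arrays;

public class ReversalInvariance {

    /** Counts pitch estimates (Hz) per pitch class; 12 bins of 100 cents is an assumption. */
    public static int[] pitchClassCounts(double[] pitchesInHz) {
        int[] counts = new int[12];
        for (double hz : pitchesInHz) {
            if (hz <= 0) continue; // skip frames without a pitch estimate
            long semitones = Math.round(12.0 * Math.log(hz / 440.0) / Math.log(2.0));
            int pitchClass = (int) (((semitones % 12) + 12) % 12);
            counts[pitchClass]++;
        }
        return counts;
    }

    /** Returns a copy of the values in reverse order. */
    public static double[] reversed(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[values.length - 1 - i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] track = {440.0, 494.0, 523.0, 440.0, 392.0}; // made-up pitch track
        int[] forward = pitchClassCounts(track);
        int[] backward = pitchClassCounts(reversed(track));
        // The counts do not depend on order, so this prints true.
        System.out.println(Arrays.equals(forward, backward));
    }
}
```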
The source code is available, as always, on the Tarsos GitHub page.
Audio Fingerprinting Results
Audio Fingerprinting Query
Large scale results
Code, Java, Music Information Retrieval, featured, and Tarsos
x360-dc445.audio_fingerprinting_query.png and AudioFingerprinter.jar
» By Joren on Monday 22 August 2011
A paper about Tarsos was submitted for review at the 12th International Society for Music Information Retrieval Conference, which will be held in Miami. The paper Tarsos – a Platform to Explore Pitch Scales in Non-Western and Western Music was reviewed and accepted; it will be published in this year’s proceedings of the ISMIR conference. It can be read below as well.
An oral presentation about Tarsos will take place on Tuesday 25 October, during the afternoon, as can be seen on the ISMIR preliminary program schedule.
If you want to cite our work, please use the following data:
@inproceedings{six2011tarsos,
  author    = {Joren Six and Olmo Cornelis},
  title     = {Tarsos - a Platform to Explore Pitch Scales in Non-Western and Western Music},
  booktitle = {Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011},
  year      = {2011},
  publisher = {International Society for Music Information Retrieval}
}
Computational ethnomusicology, Research papers, Music Information Retrieval, Code, featured, and Tarsos
tarsos_ismir_paper.bibtex.txt and tarsos_ismir_2011.pdf
» By Joren on Wednesday 21 December 2011
To prevent confusion about pitch representation in general, and pitch representation in Tarsos specifically, I wrote a document about pitch, pitch interval, and pitch ratio representation. The abstract goes as follows:
This document describes how pitch can be represented using various units. More specifically it documents how a software program to analyse pitch in music, Tarsos, represents pitch. This document contains definitions of and remarks on different pitch and pitch interval representations. For good measure we need a definition of pitch; here the definition from [McLeod 2009] is used: The pitch frequency is the frequency of a pure sine wave which has the same perceived sound as the sound of interest. For remarks and examples of cases where the pitch frequency does not coincide with the fundamental frequency of the signal, also see [McLeod 2009]. In this text pitch, pitch interval and pitch ratio are briefly discussed.
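To make the units concrete, here is a small sketch of common conversions between pitch (Hz), pitch interval (cents) and pitch ratio. It uses the standard definitions (an octave is a ratio of 2 and spans 1200 cents; MIDI key 69 is A4 at 440 Hz). The reference frequency for absolute cents is left as a parameter because the abstract does not state which one Tarsos uses. The class and method names are hypothetical and are not the Tarsos API.

```java
public class PitchUnits {

    /** Pitch interval in cents for a frequency ratio: 1200 * log2(ratio). */
    public static double ratioToCents(double ratio) {
        return 1200.0 * Math.log(ratio) / Math.log(2.0);
    }

    /** Frequency ratio for a pitch interval in cents: 2^(cents / 1200). */
    public static double centsToRatio(double cents) {
        return Math.pow(2.0, cents / 1200.0);
    }

    /** Pitch in cents above a caller-chosen reference frequency. */
    public static double hertzToAbsoluteCents(double hz, double referenceHz) {
        return ratioToCents(hz / referenceHz);
    }

    /** Fractional MIDI key number, using the convention A4 = 440 Hz = key 69. */
    public static double hertzToMidiKey(double hz) {
        return 69.0 + 12.0 * Math.log(hz / 440.0) / Math.log(2.0);
    }

    public static void main(String[] args) {
        // A pure fifth (ratio 3:2) versus an equal-tempered fifth (700 cents).
        System.out.printf("3:2 in cents:       %.3f%n", ratioToCents(1.5));
        System.out.printf("700 cents as ratio: %.6f%n", centsToRatio(700.0));
        System.out.printf("440 Hz as MIDI key: %.1f%n", hertzToMidiKey(440.0));
    }
}
```

Cents make intervals additive: stacking two intervals means adding their cent values, whereas their ratios have to be multiplied.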
Tarsos
pitch_representation.pdf