~ Software for Music Analysis

On Friday the second of December I gave a talk about software for music analysis. The aim was to make clear which types of research topics can benefit from measurements made with such software. Different types of digital music representations and examples of software packages were explained.

The following presentation was used during the talk (ppt, odp). It covered these software packages:

  • Sonic Visualiser: As its name suggests, Sonic Visualiser contains a lot of different visualisations for audio. It can be used for analysis (pitch, beat, chroma, …) with Vamp plugins. To quote: “The aim of Sonic Visualiser is to be the first program you reach for when you want to study a musical recording rather than simply listen to it”. It is the Swiss army knife of audio analysis.
  • BeatRoot is designed specifically for one goal: beat tracking. It can be used, for example, to compare the tempi of different performances of the same piece or to track tempo deviations within one piece.
  • Tartini is capable of real-time pitch analysis of sound. You can, for example, play a violin into a microphone, see the harmonics you produce and adapt your playing style based on the visual feedback. It also contains a pitch deviation measuring apparatus to analyse vibrato.
  • Tarsos is software for tone scale analysis. It is useful for extracting tone scales from audio. Different tuning systems can be seen, extracted and compared. It also makes it possible to play along with the original song on a tuned MIDI keyboard.

To show the different digital representations of music one example (Liebestraum 3 by Liszt) was used in different formats:

Tartini

Melodic Match

Sonic Visualizer

Tarsos

Digital music representations

Software for music analysis

~ Tarsos presentation at 'ISMIR 2011'

Olmo Cornelis and I just gave a presentation about Tarsos at the 12th International Society for Music Information Retrieval Conference, which is being held in Miami.

The live demo we gave went well and we got a lot of positive, interesting feedback. The presentation about Tarsos is available here.

It was the first time in the history of ISMIR that there was a session with oral presentations about Non-Western Music. We were pleased to be part of this.

The peer-reviewed paper about our work, Tarsos – a Platform to Explore Pitch Scales in Non-Western and Western Music, is available from the ISMIR website and embedded below:

~ PeachNote Piano at the ISMIR 2011 demo session

The extended abstract about PeachNote Piano has been accepted as a demonstration presentation at the ISMIR 2011 conference in Miami. To learn more about PeachNote Piano, come see us at our demo stand (during the Late Breaking and Demo Session) or read the paper: Peachnote Piano: Making MIDI instruments social and smart using Arduino, Android and Node.js. What follows is the introduction of the extended abstract:

Playing musical instruments can bring a lot of joy and satisfaction, but not all aspects of music practice are always enjoyable. In this contribution we are addressing two such sometimes unwelcome aspects: the solitude of practicing and the “dumbness” of instruments.

The process of practicing and mastering a musical instrument often takes place behind closed doors. A student of piano spends most of her time alone with the piano. Sounds of her playing get lost, and she can’t always get feedback from friends, teachers, or, most importantly, random Internet users. Analysing her practice sessions is also not easy. The technical possibility to record herself and put the recordings online is there, but the effort needed is relatively high, and so one does it only occasionally, if at all.

Instruments themselves usually do not exhibit any signs of intelligence. They are practically mechanical devices, even when implemented digitally. Usually they react only to direct actions of a player, and the player is solely responsible for the music coming out of the instrument and its quality. There is no middle ground between passive listening to music recordings and active music making for someone who is alone with an instrument.

We have built a prototype of a system that strives to offer a practical solution to the above problems for digital pianos. From the ground up, we have built a system which is capable of transmitting MIDI data from a MIDI instrument to a web service and back, exposing it in real time to the world and optionally enriching it.

A previous post about PeachNote Piano has more technical details, together with a video showing the core functionality (quasi-instantaneous USB-Bluetooth-MIDI communication). Some photos can be found below.

PeachNote Piano enclosure

PeachNote Piano in action

PeachNote Piano Schema

PeachNote Piano Arduino Shield

PeachNote Piano assembled

~ Makam Recognition with the Tarsos API

This article describes how to do makam recognition with a script that uses the Tarsos API.

The task is to find the tone scales most similar to the one used in a piece of recorded music. To complete this task you need a small set of theoretical scales and a large set of music, each piece performed in one of the scales. To make it more concrete, Turkish classical music is used as an example.

In an article by Bozkurt, pitch histograms are used for makam recognition, amongst other tasks. A makam defines rules for a composition or performance of classical Turkish music. It specifies melodic shapes and pitch intervals (the scale). The task is to identify which of nine makams is used in a specific song. A simplified, generalized implementation of this task is shown here. Our implementation has no tonic detection step. We also use only theoretical descriptions of the tone scales as templates and do not construct templates from the audio itself, as Bozkurt does. Since no knowledge of the music itself is used, the approach is generally applicable. Ioannidis Leonidas wrote an interesting master's thesis about makam recognition.

The following is an implementation in Scala, a general-purpose programming language that is interoperable with Java. The first step is to write the Scala header. This is just some boilerplate code to be able to run the script from the command line. It assumes a UNIX-like environment and tarsos.jar in the same directory:

#!/bin/sh
exec scala  -cp tarsos.jar -savecompiled "$0" "$@"
!#
import be.hogent.tarsos.util._
//other import statements

The second step constructs the templates. Tarsos can create theoretical tone scale templates using Gaussian kernels; the call to HistogramFactory.createPichClassKDE below does this for each makam. See the attached images for some examples.

// The nine makams; a Scala file with the same name (e.g. hicaz.scl) is loaded for each.
val makams = List("hicaz", "huseyni", "huzzam", "kurdili_hicazar",
                  "nihavend", "rast", "saba", "segah", "ussak")

var theoreticKDEs = Map[java.lang.String,KernelDensityEstimate]()
makams.foreach{ makam =>
  val scalaFile = makam + ".scl"
  val scalaObject = new ScalaFile(scalaFile)
  // Build the theoretical template for this tone scale.
  val kde = HistogramFactory.createPichClassKDE(scalaObject,35)
  kde.normalize
  theoreticKDEs = theoreticKDEs + (makam -> kde)
}
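To give an idea of what such a template looks like, here is a minimal sketch in plain Java that does not use Tarsos. It places a Gaussian kernel on each pitch class of a scale and wraps around the octave. The class name, the 1200-bin resolution, the example scale and the reading of the width as a standard deviation in cents are all assumptions made for illustration; the Tarsos implementation may differ.

```java
import java.util.Arrays;

/** Sketch: a theoretical pitch class template built from Gaussian kernels. */
final class PitchClassTemplate {
    static final int BINS = 1200; // assumption: one bin per cent within an octave

    /**
     * Places a Gaussian kernel on each (non-negative) pitch of the scale.
     * The width is taken to be the standard deviation in cents (an assumption).
     */
    static double[] fromScale(double[] scaleInCents, double widthInCents) {
        double[] template = new double[BINS];
        for (double pitch : scaleInCents) {
            for (int bin = 0; bin < BINS; bin++) {
                double distance = Math.abs(bin - (pitch % BINS));
                distance = Math.min(distance, BINS - distance); // circular distance
                template[bin] += Math.exp(-0.5 * Math.pow(distance / widthInCents, 2));
            }
        }
        return normalize(template);
    }

    /** Scales the values so that they sum to one. */
    static double[] normalize(double[] values) {
        double sum = Arrays.stream(values).sum();
        return Arrays.stream(values).map(v -> v / sum).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical scale in cents above the tonic, not one of the makam files.
        double[] scale = {0, 200, 350, 500, 700, 900, 1050};
        double[] template = fromScale(scale, 35);
        System.out.println("Value at 350 cents: " + template[350]);
    }
}
```

A wider kernel makes the template more forgiving of pitch deviations in a performance, at the cost of blurring neighbouring pitch classes together.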

The third and last step is matching. First a list of audio files is created by recursively iterating a directory and matching each file name against a regular expression. Next, each audio file is processed in the foreach loop. The internal implementation of the YIN pitch detection algorithm is run on the audio file and a pitch class histogram is created from the resulting annotations. The histogram is normalized to make the correlation calculation meaningful. The for loop then compares the histogram of the audio file with each of the templates calculated beforehand. The results are stored, ordered and finally printed.

val directory = "/home/joren/turkish_makams/"
// The dot before the extension is escaped so that it only matches a literal dot.
val audio_pattern = ".*\\.(mp3|wav|ogg|flac)"
val audioFiles = FileUtils.glob(directory,audio_pattern,true).toList

audioFiles.foreach{ file =>
  val audioFile = new AudioFile(file)
  // Detect pitch with YIN and build a normalized pitch class histogram.
  val detectorYin = PitchDetectionMode.TARSOS_YIN.getPitchDetector(audioFile)
  val annotations = detectorYin.executePitchDetection()
  val actualKDE = HistogramFactory.createPichClassKDE(annotations,15)
  actualKDE.normalize
  var resultList = List[Tuple2[java.lang.String,Double]]()
  // Compare with every template, each time at the shift that correlates best.
  for ((name, theoreticKDE) <- theoreticKDEs){
      val shift = actualKDE.shiftForOptimalCorrelation(theoreticKDE)
      val currentCorrelation = actualKDE.correlation(theoreticKDE,shift)
      resultList =  (name -> currentCorrelation) :: resultList
  }
  // Order by correlation, best match first.
  resultList = resultList.sortBy{_._2}.reverse
  Console.println(file + " is brought in tone scale " + resultList(0)._1)
}
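Because there is no tonic detection step, the histogram of a recording has to be rotated until it lines up with a template. The sketch below shows that idea in plain Java. The method names mirror the Tarsos calls above, but the code is my own illustration, and the similarity measure (histogram overlap) is an assumption: the measure Tarsos uses internally may be different.

```java
/** Sketch: compare two circular histograms at the best rotation. */
final class ShiftedCorrelation {
    /** Overlap of two equally sized histograms after rotating the second by `shift` bins. */
    static double correlation(double[] first, double[] second, int shift) {
        int n = first.length;
        double overlap = 0;
        double sumFirst = 0;
        double sumSecond = 0;
        for (int i = 0; i < n; i++) {
            double other = second[Math.floorMod(i + shift, n)];
            overlap += Math.min(first[i], other);
            sumFirst += first[i];
            sumSecond += other;
        }
        double largest = Math.max(sumFirst, sumSecond);
        return largest == 0 ? 0 : overlap / largest;
    }

    /** Tries every rotation and returns the first one with the highest overlap. */
    static int shiftForOptimalCorrelation(double[] first, double[] second) {
        int best = 0;
        double bestValue = -1;
        for (int shift = 0; shift < first.length; shift++) {
            double value = correlation(first, second, shift);
            if (value > bestValue) {
                bestValue = value;
                best = shift;
            }
        }
        return best;
    }
}
```

Trying every rotation costs one pass over the histogram per shift, which is cheap for 1200 bins and a handful of templates.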

A complete version of this script is available: Tone scale matching script. The results of the script when run on Bozkurt’s dataset can be seen in the attached spreadsheet (OpenOffice format or Excel format).

Theoretical template

Other theoretical template

Actual Hicaz song overlayed with a theoretical template

~ Tarsos at 'ISMIR 2011'

A paper about Tarsos was submitted for review at the 12th International Society for Music Information Retrieval Conference, which will be held in Miami. The paper, Tarsos – a Platform to Explore Pitch Scales in Non-Western and Western Music, was reviewed and accepted. It will be published in this year’s proceedings of the ISMIR conference and can also be read below.

An oral presentation about Tarsos will take place on Tuesday the 25th of October during the afternoon, as can be seen in the ISMIR preliminary program schedule.

If you want to cite our work, please use the following data:

@inproceedings{six2011tarsos,
  author     = {Joren Six and Olmo Cornelis},
  title      = {Tarsos - a Platform to Explore Pitch Scales 
                in Non-Western and Western Music},
  booktitle  = {Proceedings of the 12th International 
                Society for Music Information Retrieval Conference,
                ISMIR 2011},
  year       = {2011},
  publisher  = {International Society for Music Information Retrieval}
}

~ Resynthesis of Pitch Detection Annotations on a Flute Piece

Tarsos, a software package to analyse pitch organization in music, has a new output modality. It is now possible to export resynthesized pitch annotations, as detected by a pitch detection algorithm, and compare them with the original sound. This can be useful to find out which errors a pitch detection algorithm makes.

Below you can listen to an example of synthesized pitch detection results compared with the original flute piece. The file starts with only the original flute sound (on the right channel) and gradually changes so only the synthesized annotations (on the left channel) can be heard.

Resynthesis of Pitch Detection Annotations on a Flute Piece by Joren Six
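As a rough idea of how annotations can be turned back into sound, here is a minimal sketch in plain Java. It renders a list of (time, frequency) annotations as a sine wave and accumulates the phase so that a change in frequency does not cause a click. The class, its parameters and the choice of a plain sine are assumptions for illustration; this is not the resynthesis code in Tarsos.

```java
/** Sketch: render pitch annotations as a sine wave. */
final class AnnotationResynthesis {
    /**
     * times[i] is the start (in seconds) of the segment with frequency hertz[i];
     * the last segment lasts until `duration`. A frequency of 0 means silence.
     */
    static float[] synthesize(double[] times, double[] hertz, double duration, int sampleRate) {
        float[] samples = new float[(int) Math.round(duration * sampleRate)];
        double phase = 0;
        int current = 0;
        for (int i = 0; i < samples.length; i++) {
            double time = i / (double) sampleRate;
            // Move on to the annotation that is active at this moment.
            while (current + 1 < times.length && times[current + 1] <= time) {
                current++;
            }
            boolean started = times.length > 0 && time >= times[0];
            double frequency = started ? hertz[current] : 0;
            samples[i] = frequency > 0 ? (float) Math.sin(phase) : 0f;
            // Accumulating the phase keeps the waveform continuous across changes.
            phase += 2 * Math.PI * frequency / sampleRate;
        }
        return samples;
    }
}
```

Writing the result to one channel and the original recording to the other gives a comparison like the one in the audio example above.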

~ Tarsos in the Orpheus Institute Yearbook

As it does every year, the Orpheus Institute is organizing the Day of Artistic Research (Dag van het Artistiek Onderzoek). Below is a short text about the research project around Tarsos that will appear in the yearbook. The yearbook is a booklet with an overview of artistic research projects at Flemish institutes. It is published on the occasion of the aforementioned Day of Artistic Research.

The goal of this research project is to develop a method for obtaining a culture-independent view of musical parameters. More concretely, techniques from Music Information Retrieval are applied to study pitch, tempo and timbre. Adapting existing, mostly Western-oriented, MIR methods should lead to a structured documentation of different timbres, tone scales, metrical relations and musical forms. That description can serve as inspiration for the development of an artistic compositional language, or can be used as source material for scientific research on ethnic music, for example to demonstrate objectively the distinctiveness (or its possible decline) of oral music cultures.

In the first phase of the research the focus is on one of the more tangible parameters: pitch. In ethnic music the use of pitch is often radically different from Western music, which is usually based on the division of an octave into twelve equal parts. The software platform Tarsos was developed to extract tone scales from music and to display them. With Tarsos it is possible to run automatic tone scale analysis on a large dataset, or to obtain a detailed manual analysis of a few pieces of music. The culture-independent analysis method that Tarsos uses can be applied equally well to Indonesian, Western or African music.

Our intention is to use Tarsos to discover evolutions in tone scale use in the enormous dataset of the Royal Museum for Central Africa. Is tone scale diversity in Africa withering away under the influence of Western music? Can specific characteristics of possibly ‘extinct’ music cultures be found? These are questions that fit within the overarching research project of Olmo Cornelis, and which we try to answer with the help of Tarsos.

Later the two remaining musical parameters, tempo and timbre, will receive a similar treatment. In the final phase of this rather ambitious research project the relation between the parameters is investigated.

~ Seminar - Research on Music History and Analysis

This post contains links to genuinely useful software to do signal based audio analysis.

  • Sonic Visualiser: As its name suggests, Sonic Visualiser contains a lot of different visualisations for audio. It can be used for analysis (pitch, beat, chroma, …) with Vamp plugins. To quote: “The aim of Sonic Visualiser is to be the first program you reach for when you want to study a musical recording rather than simply listen to it”. It is the Swiss army knife of audio analysis.
  • BeatRoot is designed specifically for one goal: beat tracking. It can be used, for example, to compare the tempi of different performances of the same piece or to track tempo deviations within one piece.
  • Tartini is capable of real-time pitch analysis of sound. You can, for example, play a violin into a microphone, see the harmonics you produce and adapt your playing style based on the visual feedback. It also contains a pitch deviation measuring apparatus to analyse vibrato.
  • Tarsos is software for tone scale analysis. It is useful for extracting tone scales from audio. Different tuning systems can be seen, extracted and compared. It also makes it possible to play along with the original song on a tuned MIDI keyboard.

Melodic Match is a different beast. It does not work on the signal level but processes symbolic music. More to the point, it searches through MusicXML files, which can be created from MIDI files. See its website for use cases. Melodic Match is only available for Windows.

During a lecture at the University College Gent, Faculty of Music, these tools were presented with some examples. The slides and a zip file with audio samples, slides and software are available for reference. Most of the time was given to Tarsos, the software we developed.

Olmo Cornelis also gave a lecture about his own research and how Tarsos fits in the bigger picture. His presentation and the presentation with audio are also available here.

Sonic Visualizer

BeatRoot

Tarsos

Tartini

Melodic Match

~ Tarsos Presented at the "Perspectives for Computational Musicology" Symposium

Yesterday Tarsos was publicly presented at the symposium Perspectives for Computational Musicology in Amsterdam. It was the first public presentation of Tarsos, excluding this website. The symposium was organized by the Meertens Institute on the occasion of Peter van Kranenburg’s PhD defense.

The presentation included a live demo of a daily build of Tarsos (a Friday evening build) which worked, surprisingly, without hiccups. The presentation was done by Olmo Cornelis. This was the small introduction:

Tarsos – a Platform for Pitch Analysis of Ethnic Music
Ethnic music is a vulnerable cultural heritage that has only recently received more attention within the Music Information Retrieval community. However, access to ethnic music remains problematic, as this music does not always correspond to the Western concepts of music and metadata that underlie the currently available content-based methods. During this lecture, we would like to present our current research on pitch analysis of African music. TARSOS, a platform for analysis, will be presented as a powerful tool that can describe and compare scales in great detail.

To give Tarsos a try you can start it using Java Web Start or download the executable Tarsos JAR file. A Java 1.5 runtime is required.

~ Rendering MIDI Using Arbitrary Tone Scales

Tarsos can be used to render MIDI files to audio (WAV) files using arbitrary tone scales. This functionality can be used to (automatically) verify tone scale extraction from audio files. Since I could not find a dataset with audio and corresponding tone scales, creating one using MIDI seemed a good idea.

MIDI files can be found in spades; tone scales, on the other hand, are harder to find. Luckily there is one massive source, the Scala Tone Scale Archive: a large collection of over 3700 tone scales.

Using Scala tone scale files and MIDI files, a tone scale – audio dataset can be generated. The quality of the audio depends on the (software) synthesizer and the SoundFont used. Tarsos currently uses the Gervill synthesizer. Gervill is a pure Java software synthesizer with support for 24-bit SoundFonts and the MIDI Tuning Standard.

How To Render MIDI Using Arbitrary Tone Scales with Tarsos

A recent version of the JRE needs to be installed on your system if you want to use Tarsos. Tarsos itself can be downloaded in the form of the Tarsos JAR Package.

Currently Tarsos has a Command Line Interface. An example with the files you can find attached:


java -jar tarsos.jar --midi BWV_1007.mid --scala 120.scl --out bach.wav

The result of this command should be an audio file that sounds like the cello suites of Bach in a nonsensical tone scale with steps of 120 cents. Executing tone scale extraction on the generated audio yields the expected result: in the pitch class histogram a peak can be found every 120 cents.
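To make the retuning a bit more tangible, the sketch below computes the frequency of a MIDI key under an arbitrary scale. It assumes a simple linear keyboard mapping in which consecutive keys take consecutive scale degrees and a reference key keeps its frequency. That mapping, the class name and the parameters are assumptions for illustration; how Tarsos and Gervill map keys to scale degrees may differ.

```java
/** Sketch: frequency of a MIDI key when the keyboard is retuned to a scale. */
final class ScaleTuning {
    /**
     * scaleInCents lists the degrees above the root and ends with the interval
     * at which the scale repeats (1200 for an octave).
     * Assumption: consecutive keys step through consecutive scale degrees.
     */
    static double frequency(int midiKey, double[] scaleInCents, int referenceKey, double referenceHz) {
        int degrees = scaleInCents.length;
        int steps = midiKey - referenceKey;
        int periods = Math.floorDiv(steps, degrees);
        int degree = Math.floorMod(steps, degrees);
        double period = scaleInCents[degrees - 1];
        double cents = periods * period + (degree == 0 ? 0 : scaleInCents[degree - 1]);
        return referenceHz * Math.pow(2, cents / 1200.0);
    }

    public static void main(String[] args) {
        // Ten steps of 120 cents per octave, as in the nonsensical scale above.
        double[] scale = new double[10];
        for (int i = 0; i < scale.length; i++) {
            scale[i] = 120.0 * (i + 1);
        }
        for (int key = 69; key <= 79; key++) {
            System.out.println(key + " -> " + frequency(key, scale, 69, 440.0) + " Hz");
        }
    }
}
```

With this mapping an octave spans ten keys instead of twelve, which is part of why a familiar piece sounds so alien in such a scale.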

To summarize: by rendering audio with MIDI and Scala tone scale files a dataset with tone scale – audio information can be generated and tone scale extraction algorithms can be tested on the fly.

This method also has some limitations. Because the audio is rendered there is no (background) noise and there are no fluctuations in pitch or timbre, all of which are present in recorded audio. So testing tone scale extraction algorithms on recorded audio remains advised.

120 Cents difference

~ Tone Scale Matching With Tarsos

Tarsos can be used to search for music that uses a certain tone scale or tone interval(s). Tone scales can be defined by a Scala tone scale file or an exemplifying audio file. This text explains how you can use Tarsos for this task.

Search Using Scala Tone Scale Files

Scala files are text files with information about a tone scale. They are used to share and exchange tone scales. The file format originates from the Scala program:

Scala is a powerful software tool for experimentation with musical tunings, such as just intonation scales, equal and historical temperaments, microtonal and macrotonal scales, and non-Western scales. It supports scale creation, editing, comparison, analysis, …

The Scala file format is popular because there is a library with more than 3000 tone scales available on the Scala website.

Tarsos also understands Scala files. It is able to create a pitch class histogram using a Gaussian mixture model, a technique described in A. C. Gedik, B. Bozkurt, 2010, "Pitch Frequency Histogram Based Music Information Retrieval for Turkish Music", Signal Processing, vol. 90, pp. 1049-1063 (doi:10.1016/j.sigpro.2009.06.017).

An example should make things clear. Let's search for an interval of 300 cents, or exactly three semitones. A Scala file with this interval is easy to define:

! example.scl
! An example of a tone interval of 300 cents
Tone interval of 300 cents
2
!
900.0
1200.0
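Reading such a file takes only a few lines. The sketch below follows the rules of the Scala file format as I understand them: lines starting with an exclamation mark are comments, the first remaining line is a description, the second is the number of notes, and each following line holds a pitch that is in cents when it contains a period and a ratio otherwise. The class and method names are made up for this example; this is not the parser used in Tarsos.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: read the pitches of a Scala (.scl) file as cents. */
final class ScalaFileSketch {
    static List<Double> parseCents(String contents) {
        List<Double> pitches = new ArrayList<>();
        int dataLine = 0; // counts the lines that are not comments
        for (String raw : contents.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("!")) {
                continue; // comment line
            }
            dataLine++;
            if (dataLine <= 2 || line.isEmpty()) {
                continue; // description, number of notes, or a stray empty line
            }
            String value = line.split("\\s+")[0];
            if (value.contains(".")) {
                // A period marks a value in cents.
                pitches.add(Double.parseDouble(value));
            } else {
                // Otherwise the value is a ratio such as 3/2, or a bare integer.
                String[] ratio = value.split("/");
                double numerator = Double.parseDouble(ratio[0]);
                double denominator = ratio.length > 1 ? Double.parseDouble(ratio[1]) : 1;
                pitches.add(1200 * Math.log(numerator / denominator) / Math.log(2));
            }
        }
        return pitches;
    }
}
```

Note the consequence of the period rule: a pitch written as a bare integer is read as a ratio, so cent values always need a decimal point.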

The next step is to create a histogram with an interval of 300 cents. In the block diagram this step is called “Peak histogram creation”. The similarity calculation step expects a list of histograms to compare with the newly defined histogram. Feeding the similarity calculation with the Western 12ET tone scale and a pentatonic Indonesian Slendro tone scale shows that a 300 cents interval is used in the Western tone scale but is not available in the Slendro tone scale.

This example only uses Scala files, so creating histograms is actually not needed: intervals can be calculated from the Scala file itself. This changes when audio files are compared with each other or with Scala files.
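Calculating intervals directly from the pitch classes can be sketched as follows. The code checks every pair of pitch classes in a scale for a given interval, within a tolerance. Everything here is illustrative: the class name is made up, and the Slendro scale is idealized as five equal steps of 240 cents, which is a simplification since measured Slendro tunings vary.

```java
/** Sketch: does a scale contain a given interval? */
final class IntervalSearch {
    /** True when any two pitch classes lie `interval` cents apart, within the tolerance. */
    static boolean containsInterval(double[] pitchClassesInCents, double interval, double tolerance) {
        for (double from : pitchClassesInCents) {
            for (double to : pitchClassesInCents) {
                // Ascending distance from one pitch class to the other, within one octave.
                double difference = ((to - from) % 1200 + 1200) % 1200;
                if (Math.abs(difference - interval) <= tolerance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        double[] western = {0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100};
        double[] idealizedSlendro = {0, 240, 480, 720, 960}; // assumption: equal steps
        System.out.println("12ET has 300 cents: " + containsInterval(western, 300, 10));
        System.out.println("Slendro has 300 cents: " + containsInterval(idealizedSlendro, 300, 10));
    }
}
```

The tolerance matters as soon as measured scales are used: with a tolerance of 60 cents the idealized Slendro would match too, because 240 cents is then close enough to 300.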

Search Using Audio Files

When audio files are fed to the algorithm additional steps need to be taken.

  1. First of all pitch detection is executed on the audio file. Currently two pitch extractors are implemented in pure Java; it is also possible to use an external pitch extractor such as aubio.
  2. Using pitch annotations a Pitch Histogram is created.
  3. Peak detection on the Pitch Histogram results in a number of peaks, these should represent the distinct pitch classes used in the musical piece.
  4. With the pitch classes a clean peak histogram is created during the Peak Histogram construction phase.
  5. Finally the Peak histogram is matched with other histograms.

The last two steps are the same for audio files or scala files.
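Steps two and three can be sketched in a few lines of plain Java. The code below folds pitch estimates in Hz into a pitch class histogram with one bin per cent and then picks the peaks after some smoothing. The reference frequency (MIDI key 0), the triangular smoothing window and the threshold are assumptions chosen for illustration; the peak detection in Tarsos works differently in its details.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: pitch annotations -> pitch class histogram -> peaks. */
final class PitchClassPeaks {
    static final int BINS = 1200; // one bin per cent
    // Assumption: cents are measured from MIDI key 0, derived from A4 = 440 Hz.
    static final double REFERENCE_HZ = 440.0 * Math.pow(2, -69 / 12.0);

    /** Counts how often each pitch class (0-1199 cents) occurs. */
    static int[] histogram(double[] pitchesInHz) {
        int[] counts = new int[BINS];
        for (double hz : pitchesInHz) {
            double cents = 1200 * Math.log(hz / REFERENCE_HZ) / Math.log(2);
            counts[Math.floorMod((int) Math.round(cents), BINS)]++;
        }
        return counts;
    }

    /** Smooths with a circular triangular window and returns the local maxima. */
    static List<Integer> peaks(int[] counts, int window, int minimumWeight) {
        int n = counts.length;
        int[] smoothed = new int[n];
        for (int i = 0; i < n; i++) {
            for (int d = -window; d <= window; d++) {
                smoothed[i] += counts[Math.floorMod(i + d, n)] * (window + 1 - Math.abs(d));
            }
        }
        List<Integer> peaks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            boolean isPeak = smoothed[i] >= minimumWeight;
            for (int d = 1; d <= window && isPeak; d++) {
                // Strictly above earlier bins, so a tie yields only one peak.
                isPeak = smoothed[i] > smoothed[Math.floorMod(i - d, n)]
                        && smoothed[i] >= smoothed[Math.floorMod(i + d, n)];
            }
            if (isPeak) {
                peaks.add(i);
            }
        }
        return peaks;
    }
}
```

The window width decides how close two pitch classes may lie before they merge into one peak, which is exactly the judgement call that makes a manual correction step desirable.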

Using real audio files can cause dirty histograms. Determining how many distinct pitch classes are used is no trivial task, even for an expert (human) listener. Tarsos should provide a semi-automatic way of peak extraction: a best guess by an algorithm that can easily be corrected by a user. For the moment Tarsos does not allow manual intervention.

Tarsos

To use Tarsos you need a recent Java runtime (1.6) and the following command line arguments:

java -jar tarsos.jar rank --detector TARSOS_MPM \
  --needle audio.wav --haystack scala.scl other_audio.wav other_scala_file.scl

Data flow audio

Data flow scala

300 cents interval

12ET and 300 cents

Slendro and 300 cents

Realistic Tone scale

~ Tarsos demos

I just finished creating a first release of Tarsos. The release contains several demo applications, some more useful than others. Tarsos is a work in progress: not all functionality is exposed through the CLI demo applications. The demos should, however, give a taste of the possibilities. All demo applications follow this pattern:


java -jar tarsos.jar subcommand [--option [argument] ...]

To get help the --help switch can be used. It generates contextual help for either the subcommand or for Tarsos itself.

java -jar tarsos.jar --help
java -jar tarsos.jar subcommand --help

Detect Pitch


java -jar tarsos.jar detect_pitch --in flute.novib.mf.C5B5.wav

Midi to Audio Using a Scala Tone Scale


java -jar tarsos.jar midi_to_wav --midi satie_gymno1.mid --scala 120.scl

Audio to Scala Tone Scale


java -jar tarsos.jar audio_to_scala --in out.wav

Annotate a File


java -jar tarsos.jar annotate --in out.wav

Pitch table


java -jar tarsos.jar pitch_table

~ Tarsos Spectrogram

Today I created a spectrogram application using Tarsos. The application listens to an audio input, computes an FFT and at the same time calculates pitch. The estimated pitch is overlaid on the spectrogram. All this happens in real time and is implemented in Java.

spectrum with pitch information (red)

This is the most recent version of the spectrogram implementation in Java.

// Estimate the pitch first: the transform below writes its result into the buffer.
float pitch = Yin.processBuffer(buffer, (float) sampleRate);
fft.transform(buffer);
double maxAmplitude = 0;
for (int j = 0; j < buffer.length / 2; j++) {
        // Combine the two values stored for bin j into one amplitude.
        double first = buffer[j];
        double second = buffer[j + buffer.length / 2];
        double amplitude = Math.sqrt(first * first + second * second);
        colorIndexes[j] = amplitude;
        // Track the largest amplitude seen so far.
        maxAmplitude = Math.max(amplitude, maxAmplitude);
}
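The snippet above depends on the FFT class and buffer layout used in Tarsos. For readers who want to experiment without those, here is a self-contained sketch that computes the same kind of amplitude spectrum with a plain discrete Fourier transform. It is much slower than an FFT and only meant to show where the real and imaginary parts come from; the class and method names are made up for this example.

```java
/** Sketch: amplitude spectrum with a plain O(n^2) discrete Fourier transform. */
final class MagnitudeSpectrum {
    /** Amplitudes of the first n/2 frequency bins of the buffer. */
    static double[] magnitudes(float[] buffer) {
        int n = buffer.length;
        double[] result = new double[n / 2];
        for (int bin = 0; bin < n / 2; bin++) {
            double real = 0;
            double imaginary = 0;
            for (int i = 0; i < n; i++) {
                double angle = 2 * Math.PI * bin * i / n;
                real += buffer[i] * Math.cos(angle);
                imaginary -= buffer[i] * Math.sin(angle);
            }
            result[bin] = Math.sqrt(real * real + imaginary * imaginary);
        }
        return result;
    }

    /** Index of the bin with the largest amplitude. */
    static int loudestBin(double[] magnitudes) {
        int loudest = 0;
        for (int bin = 1; bin < magnitudes.length; bin++) {
            if (magnitudes[bin] > magnitudes[loudest]) {
                loudest = bin;
            }
        }
        return loudest;
    }
}
```

Bin k corresponds to a frequency of k times the sample rate divided by the buffer length, so a longer buffer gives finer frequency resolution at the cost of time resolution.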

If you want to test it yourself download the spectrogram jar package and execute:


java -jar spectrogram.jar

~ YIN Pitch Tracker in JAVA

To make Tarsos more portable I wrote a pitch tracker in pure Java using the YIN algorithm, based on the C implementation in aubio. The implementation also uses some code written by Karl Helgasson and Teun de Lange.

It can be used to perform real-time pitch detection or to analyse files. To use it as a real-time pitch detector just start the JAR file by double clicking. To analyse a file execute one of the following commands. The first results in a list of annotations (text), the second shows the annotations graphically.

java -jar pitch_detector_yin.jar  flute.novib.mf.C5B5.wav
java -jar pitch_detector_yin.jar  --file flute.novib.mf.C5B5.wav

The provided flute sample is from The Musical Samples library of the University of Iowa and converted to mono wav. The source code of the pitch tracker can be found below.
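For readers who want the gist of the algorithm without reading the full source, here is a condensed sketch of the main YIN steps: the difference function, the cumulative mean normalized difference, the absolute threshold and parabolic interpolation. It is a simplified illustration written for this text, not the code in the JAR file, and the threshold of 0.15 used in the example is just a commonly chosen value.

```java
/** Sketch: the core steps of the YIN pitch detection algorithm. */
final class YinSketch {
    /** Estimates the pitch in Hz, or returns -1 when no pitch is found. */
    static double pitch(float[] buffer, double sampleRate, double threshold) {
        int half = buffer.length / 2;
        double[] d = new double[half];
        // Step 1: difference function for every candidate lag tau.
        for (int tau = 1; tau < half; tau++) {
            for (int j = 0; j < half; j++) {
                double delta = buffer[j] - buffer[j + tau];
                d[tau] += delta * delta;
            }
        }
        // Step 2: cumulative mean normalized difference.
        d[0] = 1;
        double runningSum = 0;
        for (int tau = 1; tau < half; tau++) {
            runningSum += d[tau];
            d[tau] = runningSum == 0 ? 1 : d[tau] * tau / runningSum;
        }
        // Step 3: first lag below the threshold, followed down to its local minimum.
        int tau = 2;
        while (tau < half && d[tau] >= threshold) {
            tau++;
        }
        if (tau >= half) {
            return -1; // nothing periodic enough was found
        }
        while (tau + 1 < half && d[tau + 1] < d[tau]) {
            tau++;
        }
        // Step 4: parabolic interpolation around the minimum for sub-sample precision.
        double refined = tau;
        if (tau + 1 < half) {
            double before = d[tau - 1];
            double after = d[tau + 1];
            double denominator = before - 2 * d[tau] + after;
            if (denominator != 0) {
                refined = tau + 0.5 * (before - after) / denominator;
            }
        }
        return sampleRate / refined;
    }
}
```

The buffer length limits the lowest detectable pitch: only lags up to half the buffer are tried, so a buffer of 2048 samples at 44100 Hz reaches down to roughly 43 Hz.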

Screenshot

Development and Application of MIR Techniques on Contemporary Classical and Ethnic Music

Summary

While practising ethnomusicological research on a large dataset we try to develop useful software for the (ethno)musicological research community. We want to create user friendly software that provides culture independent processing of MIR-features such as pitch, tempo and timbre.

For the moment we are focusing on pitch-related information such as tone scales. Tone scales of different cultures are hard to compare using a universal language. The typical sound of a musical tradition is based on its individual characteristics, its own language. Most pitch-related software is geared towards tonal, well-tempered music and uses Western concepts and jargon. The idea behind Tarsos is to use pitch tracking algorithms to identify defining tone scale features and to visualize and export those features in a culture-independent manner, e.g. by using pitch class histograms.

In the following years tempo and timbre will receive a similar treatment.

Keywords

Pitch tracking – Sound Analysis – Culture Independent Processing of MIR annotations – Computational Ethnomusicology.

Partners

There is also a project page available in the research information system: Toepassing van Music Information Retrieval technieken op hedendaags klassieke en etnische muziek

IPEM logo

Royal Museum for Central Africa logo

Faculty of Music logo
