What

What we want to achieve

We want to provide the community with user-friendly software that offers culture-independent views on MIR features.

For the moment we are focusing on pitch-related information such as tone scales.


How

The method we are using

While practising ethnomusicological research on a large dataset, we are developing Tarsos, software intended to be useful for the (ethno)musicological research community.

Who

Who we are

Olmo Cornelis.
Musicologist, composer.

Joren Six.
Computer Scientist.


Try Tarsos

To run Tarsos you need a recent Java runtime on your machine.



~ TarsosLSH - Locality Sensitive Hashing (LSH) in Java

TarsosLSH is a Java library implementing Locality-sensitive Hashing (LSH), a practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It supports several Locality Sensitive Hashing (LSH) families: the Euclidean hash family (L2), city block hash family (L1) and cosine hash family. The library tries to hit the sweet spot between being capable enough to get real tasks done, and compact enough to serve as a demonstration on how LSH works. It relates to the Tarsos project because it is a practical way to search for and compare musical features.
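
To make the idea concrete, here is a minimal sketch of one hash function from a cosine hash family, using the common random hyperplane construction: each bit records on which side of a random hyperplane the vector lies, so vectors separated by a small angle tend to share most bits. The class name, constructor and seed handling are hypothetical and chosen for illustration only; this is not the TarsosLSH API.

```java
import java.util.Random;

/** Illustrative sketch only; names are hypothetical, not the TarsosLSH API. */
final class CosineHashSketch {
    private final double[][] hyperplanes; // one random hyperplane per hash bit

    CosineHashSketch(int dimensions, int bits, long seed) {
        if (bits < 1 || bits > 32) {
            throw new IllegalArgumentException("bits must be in 1..32");
        }
        Random random = new Random(seed);
        hyperplanes = new double[bits][dimensions];
        for (double[] plane : hyperplanes) {
            for (int d = 0; d < dimensions; d++) {
                plane[d] = random.nextGaussian();
            }
        }
    }

    /** Packs one sign bit per hyperplane into an int. */
    int hash(double[] vector) {
        int hash = 0;
        for (int bit = 0; bit < hyperplanes.length; bit++) {
            double[] plane = hyperplanes[bit];
            if (plane.length != vector.length) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            double dot = 0;
            for (int d = 0; d < vector.length; d++) {
                dot += plane[d] * vector[d];
            }
            if (dot >= 0) {
                hash |= (1 << bit);
            }
        }
        return hash;
    }

    public static void main(String[] args) {
        CosineHashSketch hasher = new CosineHashSketch(4, 32, 42L);
        double[] hans = {12, 24, 18.5, -45.6};
        double[] jane = {13, 19, -12.0, 49.8};
        // Count the bits in which the two hashes differ.
        int differingBits = Integer.bitCount(hasher.hash(hans) ^ hasher.hash(jane));
        System.out.println("Differing bits: " + differingBits);
    }
}
```

Because only the sign of each projection is kept, scaling a vector by a positive factor leaves its hash unchanged, which is why this family suits cosine distance rather than Euclidean distance.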

Quickly Getting Started with TarsosLSH

Head over to the TarsosLSH release repository and download the latest TarsosLSH library. Consult the TarsosLSH API documentation. If you, for some reason, want to build from source, you need Apache Ant and git installed on your system. The following commands fetch the source and build the library and example jars:

git clone https://JorenSix@github.com/JorenSix/TarsosLSH.git
cd TarsosLSH/build
ant  #Builds the core TarsosLSH library
ant javadoc #build the API documentation

When everything runs correctly you should be able to run the command line application, and have the latest version of the TarsosLSH library for inclusion in your projects. Also, the Javadoc documentation for the API should be available in TarsosLSH/doc. Drop me a line if you use TarsosLSH in your project. Always nice to hear how this software is used.

The fastest way to get something on your screen is to execute java -jar TarsosLSH.jar on your command line, which lets LSH run on a random data set. The full reference of the command line application is included below:

Name
	TarsosLSH: finds the nearest neighbours in a data set quickly, using LSH.
Synopsis    
	java -jar TarsosLSH.jar [options] dataset.txt queries.txt
Description
	Tries to find nearest neighbours for each vector in the 
	query file, using Euclidean (L2) distance by default.
	
	Both dataset.txt and queries.txt have a similar format: 
	an optional identifier for the vector and a list of N 
	coordinates (which should be doubles).

	[Identifier] coord1 coord2 ... coordN
	[Identifier] coord1 coord2 ... coordN
	
	For an example data set with two elements and 4 dimensions:
	
	Hans 12 24 18.5 -45.6
	Jane 13 19 -12.0 49.8
	
	Options are:
	
	-f cos|l1|l2 
		Defines the hash family to use:
			l1	City block hash family (L1)
			l2	Euclidean hash family(L2)
			cos	Cosine distance hash family
	-r radius 
		Defines the radius in which near neighbours should
		be found. Should be a double. By default a reasonable
		radius is determined automatically.
	-h n_hashes
		An integer that determines the number of hashes to 
		use. By default 4, 32 for the cosine hash family.
	-t n_tables
		An integer that determines the number of hash tables,
		each with n_hashes, to use. By default 4.
	-n n_neighbours
		Number of neighbours in the neighbourhood, defaults to 3.
	-b 
		Benchmark the settings. 
	--help 
		Prints this helpful message.
Examples
	Search for nearest neighbours using the l2 hash family with a radius of 500
	and utilizing 5 hash tables, each with 3 hashes.
	
	java -jar TarsosLSH.jar -f l2 -r 500 -h 3 -t 5 dataset.txt queries.txt
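
The input format described in the reference above can be read with a few lines of Java. The sketch below is an assumption about how such a line could be parsed, not the parser used by TarsosLSH: it treats the first token as an identifier whenever it does not parse as a double, so a purely numeric identifier would be mistaken for a coordinate. The class and method names are hypothetical.

```java
/** Illustrative sketch only; names are hypothetical, not the TarsosLSH API. */
final class DatasetLineSketch {
    final String identifier;     // null when the line carries no identifier
    final double[] coordinates;

    DatasetLineSketch(String identifier, double[] coordinates) {
        this.identifier = identifier;
        this.coordinates = coordinates;
    }

    /** Parses "[Identifier] coord1 coord2 ... coordN". */
    static DatasetLineSketch parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("empty line");
        }
        String[] tokens = trimmed.split("\\s+");
        String identifier = null;
        int first = 0;
        try {
            Double.parseDouble(tokens[0]);
        } catch (NumberFormatException e) {
            // Assumption: a non-numeric first token is the optional identifier.
            identifier = tokens[0];
            first = 1;
        }
        double[] coordinates = new double[tokens.length - first];
        for (int i = first; i < tokens.length; i++) {
            coordinates[i - first] = Double.parseDouble(tokens[i]);
        }
        return new DatasetLineSketch(identifier, coordinates);
    }

    /** Euclidean (L2) distance, the default distance named in the reference. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double difference = a[i] - b[i];
            sum += difference * difference;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        DatasetLineSketch hans = parse("Hans 12 24 18.5 -45.6");
        DatasetLineSketch jane = parse("Jane 13 19 -12.0 49.8");
        System.out.println(euclidean(hans.coordinates, jane.coordinates));
    }
}
```

A linear scan with such a distance function is a useful baseline for checking that the neighbours reported by LSH are plausible.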

Source Code Organization

The source tree is divided in three directories:

  • src contains the source files of the core LSH library.
  • test contains unit tests for some of the LSH functionality.
  • build contains the Ant build files, used either to build the Javadoc documentation or runnable JAR files for the example applications.

Further Reading

This section includes links to resources used to implement this library.

~ Flanger Audio Effect in Java

The DSP library for Tarsos, aptly named TarsosDSP, now includes an example demonstrating the flanging audio effect. Flanging, essentially mixing the signal with a varying delay of itself, produces an interesting interference pattern.

Pitch estimation synthesizer

The flanging example works on WAV files or on input from the microphone. Try it yourself: download Flanging.jar, the executable JAR file. Below you can check what flanging sounds like with various parameters.

The source code of the Java implementation can be found on the TarsosDSP github page.
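
For readers who want to see the principle before opening the repository, the following is a minimal offline sketch of flanging as described above: each output sample mixes the input with a copy of itself whose delay is swept by a low frequency oscillator. It is not the TarsosDSP implementation; the class name, the parameter names and the sinusoidal sweep are assumptions made for this illustration.

```java
/** Illustrative sketch only; not the TarsosDSP flanger. */
final class FlangerSketch {

    /**
     * @param input           mono samples, expected in [-1, 1]
     * @param sampleRate      sample rate in Hz
     * @param maxDelaySeconds largest delay reached by the sweep
     * @param lfoHz           sweep rate of the low frequency oscillator
     * @param wet             share of the delayed signal in the mix, 0..1
     */
    static float[] flange(float[] input, double sampleRate,
                          double maxDelaySeconds, double lfoHz, double wet) {
        float[] output = new float[input.length];
        double maxDelaySamples = maxDelaySeconds * sampleRate;
        for (int n = 0; n < input.length; n++) {
            // The oscillator sweeps the delay between 0 and maxDelaySamples.
            double lfo = 0.5 * (1 + Math.sin(2 * Math.PI * lfoHz * n / sampleRate));
            double position = n - lfo * maxDelaySamples;
            float delayed = 0f; // silence before the start of the signal
            if (position >= 0) {
                int index = (int) position;
                double fraction = position - index;
                float a = input[index];
                float b = input[Math.min(index + 1, input.length - 1)];
                // Linear interpolation between neighbouring samples.
                delayed = (float) ((1 - fraction) * a + fraction * b);
            }
            output[n] = (float) ((1 - wet) * input[n] + wet * delayed);
        }
        return output;
    }

    public static void main(String[] args) {
        double sampleRate = 44100;
        float[] tone = new float[44100];
        for (int n = 0; n < tone.length; n++) {
            tone[n] = (float) Math.sin(2 * Math.PI * 440 * n / sampleRate);
        }
        float[] flanged = flange(tone, sampleRate, 0.003, 0.5, 0.5);
        System.out.println("Processed " + flanged.length + " samples");
    }
}
```

The interference pattern arises because the delayed copy cancels or reinforces different frequencies as the delay changes; a real-time version would keep a circular buffer instead of reading from the whole input array.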

~ TarsosDSP Christmas Edition: Jingle Cats

The DSP library for Tarsos, aptly named TarsosDSP, now includes an example showing how to synthesize cat sounds. The inspiration came from this YouTube video.

To hear what exactly it does, listen to the following audio example.

There is also a command line interface, the following command does

java -jar Catify-latest.jar in.mid

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     
                                                    
----------------------------------------------------
Name:
	TarsosDSP catify'er
----------------------------------------------------
Synopsis:
	java -jar Catify-latest.jar input.mid
----------------------------------------------------
Description:
	

The source code of the Java implementation of the catify’er can be found on the TarsosDSP github page.

~ TarsosDSP Pitch Estimation Synthesizer

The DSP library for Tarsos, aptly named TarsosDSP, now includes an example showing how to synthesize pitch estimations. The goal of the example is to show which errors are made by different pitch detectors.

Pitch estimation synthesizer

To test the application, download and execute the Resynthesizer.jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To hear what exactly it does, compare the following two audio fragments:


There is also a command line interface. The following command tracks the pitch and follows the envelope of in.wav, then immediately plays the result on the default audio device. If you want to save the audio, see the command line options. The flute example is provided for your convenience.

java -jar Resynthesizer-latest.jar in.wav

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     
                                                    
----------------------------------------------------
Name:
	TarsosDSP resynthesizer
----------------------------------------------------
Synopsis:
	java -jar CommandLineResynthesizer.jar [--detector DETECTOR] [--output out.wav] [--combined combined.wav] input.wav
----------------------------------------------------
Description:
	Extracts pitch and loudnes from audio and resynthesises the audio with that information.
	The result is either played back our written in an output file. 
	There is als an option to combine source and synthezized material
	in the left and right channels of a stereo audio file.


	input.wav		a readable wav file.

	--output out.wav		a writable file.

	--combined combined.wav		a writable output file. One channel original, other synthesized.
	--detector DETECTOR	defaults to FFT_YIN or one of these:
				YIN
				MPM
				FFT_YIN
				DYNAMIC_WAVELET
				AMDF


The source code of the Java implementation of the synthesizer can be found on the TarsosDSP github page.
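
The core of such a resynthesis can be sketched as a sine oscillator that follows a pitch and loudness track frame by frame. The sketch below is a simplified illustration, not the TarsosDSP code: the class name is hypothetical, and it assumes that a non-positive pitch marks a frame without a pitch estimate. The phase is carried over between frames so that a change in pitch does not produce a click.

```java
/** Illustrative sketch only; not the TarsosDSP resynthesizer. */
final class PitchResynthesisSketch {

    /**
     * @param pitchHz    one pitch estimate per frame; values <= 0 are treated
     *                   as "no pitch" (an assumption made for this sketch)
     * @param gain       one amplitude per frame, in 0..1
     * @param frameSize  number of samples per frame
     * @param sampleRate sample rate in Hz
     */
    static float[] synthesize(double[] pitchHz, double[] gain,
                              int frameSize, double sampleRate) {
        if (pitchHz.length != gain.length) {
            throw new IllegalArgumentException("one gain value per pitch value expected");
        }
        float[] output = new float[pitchHz.length * frameSize];
        double phase = 0; // kept across frames to avoid discontinuities
        for (int frame = 0; frame < pitchHz.length; frame++) {
            boolean pitched = pitchHz[frame] > 0;
            double increment = pitched ? 2 * Math.PI * pitchHz[frame] / sampleRate : 0;
            for (int i = 0; i < frameSize; i++) {
                int n = frame * frameSize + i;
                output[n] = pitched ? (float) (gain[frame] * Math.sin(phase)) : 0f;
                phase += increment;
                if (phase > 2 * Math.PI) {
                    phase -= 2 * Math.PI;
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        double[] pitch = {440, 440, -1, 494};
        double[] gain = {0.8, 0.6, 0.0, 0.7};
        float[] audio = synthesize(pitch, gain, 1024, 44100);
        System.out.println("Synthesized " + audio.length + " samples");
    }
}
```

Listening to such a signal next to the original makes octave errors and dropouts of a pitch detector immediately audible, which is the point of the example application.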

~ Phase Vocoding: Time Stretching and Pitch Shifting with TarsosDSP Java

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a pitch shifting algorithm (as of version 1.4) and a time stretching algorithm. Combined, the two can be used for something like phase vocoding. With a phase vocoder you can load an audio snippet, change its pitch and duration and, for example, create a library of snippets. By recording a single piano key stroke, it is possible to generate two octaves of samples of different lengths and use those instead of synthesized samples. The following example application shows exactly that, implemented in the Java programming language.

The example application below shows how to pitch shift and time stretch a sample to create a sample library with the TarsosDSP library.

Pitch shifting in Java

Find your oven fresh baked binaries at the TarsosDSP Release Repository.

~ Tarsos 1.0: Transcription Features

Today marks the release of Tarsos 1.0. The new Tarsos release contains practical transcription features. As can be seen in the screenshot below, a time stretching feature makes it easy to loop a certain audio fragment while it is playing at a slow tempo. The next loop can be played by pressing the n key, the one before by pressing b.

Since the pitch classes can be found in a song, and there is a feature that lets you play a MIDI keyboard in the tone scale of the song under analysis, transcription of ethnic music is made a lot easier.

The new release of Tarsos can be found in the Tarsos release repository. From now on, nightly releases are uploaded there automatically.

~ Pitch Shifting - Implementation in Pure Java with Resampling and Time Stretching

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a pitch shifting algorithm (as of version 1.4). The goal of pitch shifting is to change the pitch of a piece of audio without affecting the duration. The algorithm implemented is a combination of resampling and time stretching. Resampling changes the pitch of the audio, but also affects the total duration. Subsequently, time stretching restores the audio to its original duration without affecting the pitch. The result is very similar to phase vocoding.

The example application below shows how to pitch shift input from the microphone in real-time, or pitch shift a recorded track with the TarsosDSP library.

Pitch shifting in Java

To test the application, download and execute the PitchShift.jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To get started you can try this piece of audio.

There is also a command line interface. The following command lowers the pitch of in.wav by two semitones:

java -jar PitchShift.jar in.wav out.wav -200

----------------------------------------------------
 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     
                                                    
----------------------------------------------------
Name:
	TarsosDSP Pitch shifting utility.
----------------------------------------------------
Synopsis:
	java -jar PitchShift.jar source.wav target.wav cents
----------------------------------------------------
Description:
	Change the play back speed of audio without changing the pitch.

		source.wav	A readable, mono wav file.
		target.wav	Target location for the pitch shifted file.
		cents		Pitch shifting in cents: 100 means one semitone up, 
				-100 one down, 0 is no change. 1200 is one octave up.

The resampling feature was implemented with libresample4j by Laszlo Systems. libresample4j is a Java port of Dominic Mazzoni’s libresample 0.1.3, which is in turn based on Julius Smith’s Resample 1.7 library.
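
The cents argument and the two-step approach reduce to a little arithmetic, sketched below. The conversion from cents to a frequency ratio is the standard definition (1200 cents per octave); the helper names are hypothetical and the duration bookkeeping is an idealised illustration of the resample-then-stretch description above, not code taken from TarsosDSP.

```java
/** Illustrative arithmetic only; helper names are hypothetical. */
final class PitchShiftMathSketch {

    /** Frequency ratio for a shift in cents: 1200 cents is one octave. */
    static double centsToRatio(double cents) {
        return Math.pow(2.0, cents / 1200.0);
    }

    /** Duration after resampling alone: a higher pitch means shorter audio. */
    static double durationAfterResampling(double seconds, double ratio) {
        return seconds / ratio;
    }

    /** Time stretching factor that restores the original duration. */
    static double stretchFactorToRestore(double ratio) {
        return ratio;
    }

    public static void main(String[] args) {
        double ratio = centsToRatio(-200); // two semitones down
        double resampled = durationAfterResampling(10.0, ratio);
        double restored = resampled * stretchFactorToRestore(ratio);
        System.out.printf("ratio=%.4f resampled=%.3fs restored=%.3fs%n",
                ratio, resampled, restored);
    }
}
```

For -200 cents the ratio is roughly 0.891, so resampling alone would make a ten second file a little over eleven seconds long before time stretching brings it back to ten.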

~ CIM 2012 - Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means

What follows is about the Conference on Interdisciplinary Musicology (CIM) and the 15th International Conference of the Gesellschaft für Musikforschung. This text first gives information about our contribution to CIM 2012, Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means, and then lists a number of highlights of the conference. The joint conference took place from the 4th to the 8th of September 2012.

In 2012, CIM will tackle the subject of History. Hosted by the University of Göttingen, whose one time music director Johann Nikolaus Forkel is widely regarded as one of the founders of modern music historiography, CIM12 aims to promote collaborations that provoke and explore new methods and methodologies for establishing, evaluating, preserving and communicating knowledge of music and musical practices of past societies and the factors implicated in both the preservation and transformation of such practices over time.

Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means

Our contribution to CIM 2012 is titled Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means. The aim was to show how tone scales of the past, e.g. organ tunings, can be extracted and sonified. During the demo special attention was given to historic Central African tuning systems. The presentation I gave is included below and is available for download.

Highlights

What follows are some personal highlights of the Conference on Interdisciplinary Musicology and the 15th International Conference of the Gesellschaft für Musikforschung. The joint conference took place from the 4th to the 8th of September 2012.

The work presented by Rytis Ambrazevicius et al., Modal changes in traditional Lithuanian singing: Diachronic aspect, has a lot in common with our research, and it was interesting to see their approach. Another highlight of the conference was the whole session organized by Klaus-Peter Brenner around Mbira music.

Rainer Polak gave a talk titled ‘Swing, Groove and Metre. Asymmetric Feels, Metric Ambiguity and Metric Transformation in African Musics’. He showed how research about rhythm in jazz research, music theory and empirical musicology (amongst others) could be bridged and applied to ethnic music.

The overview Eleanore Selfridge-Field gave during her talk Between an Analogue Past and a Digital Future: The Evolving Digital Present was refreshing. She had a really clear view on all the different ways musicology and digital media can benefit from each other.

From the concert programme I found two items especially interesting: the lecture-performance by Margarete Maierhofer-Lischka and Frauke Aulbert of Lotofagos, a piece by Beat Furrer, and Burdocks, composed and performed by Christian Wolff and a bunch of enthusiastic students.

~ Analytical Approaches To World Music - Microtonal Scale Exploration in Central Africa

At the 2012 AAWM conference we presented a way to explore tone scales in the music of Central Africa. Since the audience consisted of (ethno)musicologists, the main focus of the presentation was on the application part; the technical aspects were only briefly mentioned.

The extended abstract can be consulted: Towards the tangible: microtonal scale exploration in Central-African music

The conference program itself was very diverse and interesting.

~ Guest Lecture at MIT - Ethnic Music Analysis: Challenges & Opportunities - Tarsos as a Case Study

On Thursday the 3rd of May I gave a guest lecture titled ‘Ethnic Music Analysis: Challenges & Opportunities’, which featured Tarsos as a case study. The goal was to identify the difficulties when dealing with ethnic music and to show a possible approach: the one implemented by Tarsos.

The invitation to give the guest lecture came from Michael Cuthbert who is one of the driving forces behind music21. The audience was a small group of double majors in both musicology and computer science: the ideal profile to gather useful feedback.

~ Tarsos Demonstration

We created a video explaining what Tarsos can do, and it is now available in French.

~ Audio Time Stretching - Implementation in Pure Java Using WSOLA

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a time stretching algorithm. The goal of time stretching is to change the duration of a piece of audio without affecting the pitch. The algorithm implemented is described in An Overlap-add Technique Based On Waveform Similarity (WSOLA) for High Quality Time-Scale Modification of Speech.

Time Stretching (WSOLA) in Java

To test the application, download and execute the WSOLA jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To get started you can try this piece of audio.

There is also a command line interface. The following command doubles the duration of in.wav, which halves the playback speed:

java -jar TimeStretch.jar in.wav out.wav 2.0

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     
                                                    
----------------------------------------------------
Name:
	TarsosDSP Time stretch utility.
----------------------------------------------------
Synopsis:
	java -jar TimeStretch.jar source.wav target.wav factor
----------------------------------------------------
Description:
	Change the play back speed of audio without changing the pitch.

		source.wav	A readable, mono wav file.
		target.wav	Target location for the time stretched file.
		factor		Time stretching factor: 2.0 means double the length, 0.5 half. 1.0 is no change.

The source code of the Java implementation of WSOLA can be found on the TarsosDSP github page.
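
The step that gives WSOLA its name is a similarity search: before a segment is overlap-added to the output, the algorithm looks in a small region around the nominal input position for the segment that best matches the expected continuation of what was already written. The sketch below illustrates only that search, using plain (unnormalised) cross-correlation as the similarity measure; the class and method names are hypothetical and this is not the TarsosDSP code.

```java
/** Illustrative sketch of the WSOLA similarity search; not the TarsosDSP code. */
final class WsolaSearchSketch {

    /**
     * Returns the offset into searchRegion at which a segment of
     * reference.length samples correlates best with reference.
     */
    static int bestOffset(float[] reference, float[] searchRegion) {
        int candidates = searchRegion.length - reference.length + 1;
        if (candidates < 1) {
            throw new IllegalArgumentException("search region shorter than reference");
        }
        int best = 0;
        double bestCorrelation = Double.NEGATIVE_INFINITY;
        for (int offset = 0; offset < candidates; offset++) {
            double correlation = 0;
            for (int i = 0; i < reference.length; i++) {
                correlation += reference[i] * searchRegion[offset + i];
            }
            if (correlation > bestCorrelation) {
                bestCorrelation = correlation;
                best = offset;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        float[] reference = {1f, -2f, 3f, -1f};
        float[] region = new float[20];
        // Hide the reference at offset 7 in an otherwise silent region.
        System.arraycopy(reference, 0, region, 7, reference.length);
        System.out.println("Best offset: " + bestOffset(reference, region));
    }
}
```

Choosing the segment this way keeps the waveform periods aligned at the overlap, which avoids the phase cancellation that a fixed-position overlap-add would introduce.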

~ A Robust Audio Fingerprinter Based on Pitch Class Histograms - Applications for Ethnic Music Archives

For the Folk Music Analysis (FMA) 2012 conference we (Olmo Cornelis and myself) wrote a paper presenting a new acoustic fingerprinting scheme based on pitch class histograms.

The aim of acoustic fingerprinting is to generate a small representation of an audio signal that can be used to identify or recognize similar audio samples in a large audio set. A robust fingerprint generates similar fingerprints for perceptually similar audio signals. A piece of music with a bit of noise added should generate an almost identical fingerprint as the original. The use cases for audio fingerprinting or acoustic fingerprinting are myriad: detection of duplicates, identifying songs, recognizing copyrighted material,…

Using a pitch class histogram as a fingerprint seems like a good idea: it is unique for a song and it is reasonably robust to changes of the underlying audio (length, tempo, pitch, noise). The idea has probably been found a couple of times independently, but there is also a reference to it in the literature, by Tzanetakis, 2003: Pitch Histograms in Audio and Symbolic Music Information Retrieval:

Although mainly designed for genre classification it is possible that features derived from Pitch Histograms might also be applicable to the problem of content-based audio identification or audio fingerprinting (for an example of such a system see (Allamanche et al., 2001)). We are planning to explore this possibility in the future.

Unfortunately they never, as far as I know, did explore this possibility, and I also do not know if anybody else did. I found it worthwhile to implement a fingerprinting scheme on top of the Tarsos software foundation. Most elements are already available in the Tarsos API: a way to detect pitch, construct a pitch class histogram, correlate pitch class histograms with a pitch shift,… I created a GUI application which is presented here. It is, probably, the first open source acoustic / audio fingerprinting system based on pitch class histograms.

Audio fingerprinter based on pitch class histograms

It works using drag and drop, and the idea is to find a needle (an audio file) in a haystack (a large number of audio files). For every audio file in the haystack, and for the needle, pitch is detected using an MPM implementation optimized for speed. A pitch class histogram is created for each file, the histogram of the needle is compared with each histogram in the haystack and, hopefully, the needle is found.
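
The comparison step can be sketched as follows. Pitch estimates in Hz are folded into one octave of 1200 cents to form a normalised pitch class histogram, and two histograms are compared under every circular shift so that a transposed copy can still match. This is a simplified illustration with hypothetical names, not the Tarsos API: it uses one bin per cent and histogram intersection as a stand-in for the correlation measure used in Tarsos.

```java
/** Illustrative sketch only; names are hypothetical, not the Tarsos API. */
final class PitchClassHistogramSketch {
    static final int BINS = 1200; // one bin per cent within an octave

    /** Builds a normalised pitch class histogram; values <= 0 are skipped. */
    static double[] histogram(double[] pitchesHz, double referenceHz) {
        double[] bins = new double[BINS];
        int count = 0;
        for (double hz : pitchesHz) {
            if (hz <= 0) {
                continue;
            }
            double cents = 1200.0 * Math.log(hz / referenceHz) / Math.log(2.0);
            double folded = ((cents % 1200.0) + 1200.0) % 1200.0; // into [0, 1200)
            bins[(int) folded]++;
            count++;
        }
        for (int i = 0; i < BINS && count > 0; i++) {
            bins[i] /= count;
        }
        return bins;
    }

    /** Histogram intersection of a with b shifted by the given number of cents. */
    static double intersection(double[] a, double[] b, int shift) {
        double sum = 0;
        for (int i = 0; i < BINS; i++) {
            sum += Math.min(a[i], b[(i + shift) % BINS]);
        }
        return sum;
    }

    /** Shift in cents (0..1199) at which the two histograms overlap most. */
    static int bestShift(double[] a, double[] b) {
        int best = 0;
        double bestScore = -1;
        for (int shift = 0; shift < BINS; shift++) {
            double score = intersection(a, b, shift);
            if (score > bestScore) {
                bestScore = score;
                best = shift;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] needle = histogram(new double[] {440, 550, 660, 880}, 440);
        double[] candidate = histogram(new double[] {220, 275, 330}, 440);
        int shift = bestShift(needle, candidate);
        System.out.println("shift=" + shift + " score=" + intersection(needle, candidate, shift));
    }
}
```

Folding into one octave is also what makes the fingerprint insensitive to the order of the notes, which is consistent with a time-reversed file still being recognised.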

An experiment was done on the audio collection of the museum for Central Africa. A test dataset was generated using SoX with the following Ruby script. The raw results were parsed with another Ruby script. With the data a spreadsheet with the results was created (OpenOffice.org format). Those results are mentioned in the paper.

You can try the system yourself by downloading the fingerprinter.

Drag and drop UI


~ Children's University (Kinderuniversiteit) - Music Under the Microscope!

On Sunday 18 December 2011 I gave a workshop for the Ghent children's university (Kinderuniversiteit). The theme of the children's university was Music Under the Microscope (Muziek onder de microscoop). The teaser for the workshop can be found here:

WORKSHOP – Listening to music on the computer (Muziek (ont)luisteren op de computer)
Is it possible to play the piano on a table? Can a computer listen to music and enjoy it? What is music, really, and how does sound work?
During this workshop these questions are answered with the help of a few computer programs!

Concretely, some components of sound (and, by extension, music) are demonstrated with small computer programs made at the conservatory:

  • Loudness: a decibel meter with a certain threshold. Try to be as loud as possible and see how hard it is to rise in decibels once a certain level has been reached.
  • Pitch: a small game to demonstrate pitch. Try to sing or whistle as accurately as possible and compare your score.
  • Percussion: this program reacts to hand claps. How can you tell the difference between, for example, a whistled tone and a hand clap?

The photos below give an impression of the atmosphere.

 

~ Robust Audio Fingerprinting with Tarsos and Pitch Class Histograms

The aim of acoustic fingerprinting is to generate a small representation of an audio signal that can be used to identify or recognize similar audio samples in a large audio set. A robust fingerprint generates similar fingerprints for perceptually similar audio signals. A piece of music with a bit of noise added should generate an almost identical fingerprint as the original. The use cases for audio fingerprinting or acoustic fingerprinting are myriad: detection of duplicates, identifying songs, recognizing copyrighted material,…

Using a pitch class histogram as a fingerprint seems like a good idea: it is unique for a song and it is reasonably robust to changes of the underlying audio (length, tempo, pitch, noise). The idea has probably been found a couple of times independently, but there is also a reference to it in the literature, by Tzanetakis, 2003: Pitch Histograms in Audio and Symbolic Music Information Retrieval:

Although mainly designed for genre classification it is possible that features derived from Pitch Histograms might also be applicable to the problem of content-based audio identification or audio fingerprinting (for an example of such a system see (Allamanche et al., 2001)). We are planning to explore this possibility in the future.

Unfortunately they never, as far as I know, did explore this possibility, and I also do not know if anybody else did. I found it worthwhile to implement a fingerprinting scheme on top of the Tarsos software foundation. Most elements are already available in the Tarsos API: a way to detect pitch, construct a pitch class histogram, correlate pitch class histograms with a pitch shift,… I created a GUI application which is presented here. It is, probably, the first open source acoustic / audio fingerprinting system based on pitch class histograms.

Audio fingerprinter based on pitch class histograms

It works using drag and drop, and the idea is to find a needle (an audio file) in a haystack (a large number of audio files). For every audio file in the haystack, and for the needle, pitch is detected using a Yin implementation optimized for speed. A pitch class histogram is created for each file, the histogram of the needle is compared with each histogram in the haystack and, hopefully, the needle is found.

Unfortunately I do not have time for rigorous testing (by building a large acoustic fingerprinting data set, or another decent test bench) but the idea seems to work. With the following modifications, done with Audacity effects, the needle was still found in a haystack of 836 files:

  • A 10% speedup
  • 15 and 30 seconds removed from the needle (a song of 4 minutes 12 seconds)
  • White noise added
  • Reversed the audio (This is, I believe, a rather unique property of this fingerprinting technique)
  • GSM reencoded

With the following modifications the correct song was not identified:

  • A one semitone pitch shift
  • A two semitone pitch shift
  • 60 seconds removed from the needle

The original was also found. No failure analysis was done. The haystack consists of about 100 hours of western pop; the needle is also a western pop song. If somebody wants to pick up this work or has an acoustic fingerprinting data set, drop me a line.

The source code is available, as always, on the Tarsos GitHub page.

Audio Fingerprinting Results

Audio Fingerprinting Query

Large scale results


~ Tarsos at 'ISMIR 2011'

A paper about Tarsos was submitted for review at the 12th International Society for Music Information Retrieval Conference, which will be held in Miami. The paper, Tarsos – a Platform to Explore Pitch Scales in Non-Western and Western Music, was reviewed and accepted; it will be published in this year’s proceedings of the ISMIR conference. It can be read below as well.

An oral presentation about Tarsos will take place on Tuesday the 25th of October during the afternoon, as can be seen on the ISMIR preliminary program schedule.

If you want to cite our work, please use the following data:

@inproceedings{six2011tarsos,
  author     = {Joren Six and Olmo Cornelis},
  title      = {Tarsos - a Platform to Explore Pitch Scales 
                in Non-Western and Western Music},
  booktitle  = {Proceedings of the 12th International 
                Society for Music Information Retrieval Conference,
                ISMIR 2011},
  year       = {2011},
  publisher  = {International Society for Music Information Retrieval}
}

~ TarsosTranscoder

Tarsos Transcoder is a library to transcode audio with Java.

Downloads and more info on http://tarsos.0110.be/tag/TarsosTranscoder

It uses (platform dependent) FFmpeg binaries in the background. It is a fork of JAVE (Java Audio and Video Encoder) by Carlo Pelliccia (www.sauronsoftware.it).

Tarsos Transcoder focuses only on audio. It is compatible with more, and more recent, FFmpeg binaries, and it is less dependent on the text output of the different binaries. The interface is also simplified. It falls back to the ffmpeg binary in the system path, if one is present, and therefore supports platforms for which no binary is provided within the release.

Getting Started

If you have Apache Ant and git installed on your system the following commands get you started quickly:

git clone https://JorenSix@github.com/JorenSix/TarsosTranscoder.git
cd TarsosTranscoder/build
ant #Compiles and builds the core TarsosTranscoder library
ant javadoc #Creates the javadoc documentation in TarsosTranscoder/doc
java -jar tarsos_transcoder-1.0.jar ../audio/input/tone/tone_10s.wav test.flac FLAC_MONO_44KHZ #Test wav to flac transcoding

If you want to use the transcoder from within Java you need to call Transcoder. It is as simple as:

Transcoder.transcode("foo.mp3","foo.wav",DefaultAttributes.WAV_PCM_S16LE_STEREO_44KHZ);

FFmpeg can encode to a lot of audio formats and can decode even more.

Inner workings

Tarsos Transcoder tries to find an FFmpeg binary in the system path. If it does not find one, it tries to copy a binary for the current platform. Tarsos Transcoder contains three binaries: one for Mac OS X, one for Linux (x86) and one for Windows. Tarsos Transcoder has been tested on:

  • Mac OS X 10.6
  • Windows 7
  • Ubuntu Linux 10.10 ARM
  • Ubuntu Linux 10.04 x86_64

It will probably work most of the time.

Alternative Binaries

If TarsosTranscoder does not include binaries for your platform, install ffmpeg and add the ffmpeg executable to your system path. It will be found and used by TarsosTranscoder automatically.

Alternatively, providing binaries for your (unsupported) platform can be done by implementing FFMPEGLocator. The PickMe method should yield true on your platform and copy e.g. an FFmpeg binary to a temporary directory.

License

This software is licensed under the GPL; TarsosTranscoder is based on JAVE (GPL).

Credits

JAVE (Java Audio and Video Encoder) by Carlo Pelliccia – www.sauronsoftware.it

FFmpeg: this uses libraries from the FFmpeg project under the LGPLv2.1

This product includes software developed by The Apache Software Foundation. It uses the Apache Commons Exec library, licensed under the Apache License Version 2.0

TarsosTranscoder is used by Tarsos. Tarsos is developed at University College Ghent, Faculty of Music.

~ ICMC 2012 - Sound to Scale to S

At this year's ICMC conference, ICMC 2012, we presented a paper describing a way to experiment with ton ...