[Clam-devel] GSoC: Enhancing chord detection
Roman Goj
roman.goj at gmail.com
Sun May 27 15:04:41 PDT 2007
Hi everyone!
I seem to be unable to write a short mail, so here is a table of
contents for quick reference:
I) Officially introducing myself to the mailing list in general
II) What I've been trying to do with my project lately
III) What I really did and how I feel it worked out (the reason for this
mail)
IV) What next?
V) A technical question about editing the wiki pages
I) I'm one of the Google Summer of Code students happily working for
CLAM this year :) I'll take a few words to introduce myself to anyone
who doesn't know this yet:
I'm a student of Physics at Warsaw University, specializing in Medical
Physics - EEG signal analysis mostly. I also love music... and thus CLAM
is a great opportunity to combine the two - musical signal analysis :)
Our faculty uses a time-frequency transform called Matching Pursuit,
which decomposes a given signal into atoms, usually Gabor atoms (sin/cos
with a Gaussian envelope). The principle is basically the same as for
wavelet transforms, only not with wavelets... I thought it would be a
nice idea to try using this transform for chord extraction/tone
extraction/instrument recognition, perhaps using atoms other than
standard Gabors (like extracts of recordings of real musical
instruments)... And this is what I planned to implement in CLAM this
summer...
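For anyone who has not met Matching Pursuit before, here is a minimal sketch of the greedy loop in plain Python. It is only an illustration, not the Matlab code used for the tests: the function names (gabor_atom, matching_pursuit), the sample rate, the atom width and the tiny five-atom dictionary are all assumptions made up for this example.

```python
import math

def gabor_atom(n, freq, center, width, sample_rate):
    """Unit-norm real Gabor atom: a cosine under a Gaussian envelope."""
    atom = [
        math.exp(-0.5 * ((i - center) / width) ** 2)
        * math.cos(2 * math.pi * freq * (i - center) / sample_rate)
        for i in range(n)
    ]
    norm = math.sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy MP: repeatedly subtract the best-correlated atom from the residual."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Inner product of the current residual with every atom.
        scores = {key: sum(r * a for r, a in zip(residual, atom))
                  for key, atom in dictionary.items()}
        best = max(scores, key=lambda k: abs(scores[k]))
        coeff = scores[best]
        # Remove the chosen atom's contribution before the next iteration.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
        picks.append((best, coeff))
    return picks, residual

# Toy example (all values are assumptions): two atoms mixed with known weights.
sr, n = 8000, 512
dictionary = {f: gabor_atom(n, f, n // 2, 60, sr) for f in (220, 262, 330, 392, 440)}
signal = [3.0 * a + 1.0 * b for a, b in zip(dictionary[262], dictionary[392])]
picks, residual = matching_pursuit(signal, dictionary, 2)
for freq, coeff in picks:
    print(freq, round(coeff, 3))
```

A real dictionary would also vary the position and width of the atoms, which is where most of the computing time goes.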
II) And this brings us to what I wanted to write - as David suggested,
these past couple of days I've been working on some quick and dirty
tests of MP with music (written in Matlab for now). There's some fear
that this algorithm might not work as expected or that it might not be
that easy to make it work that way (there's some research in that
direction right now, but definitely no working commercial-class
implementations) and that it might be incredibly slow... so - the need
for tests...
... which are inconclusive IMHO :(
III) I tried basically two/three things:
1. Matching pursuit using instrument atoms
2. MP using Gabors/harmonic Gabors
Ad 1.
I took recordings of seven piano notes, composed a 5 second test signal:
time [s]   notes
0-1        C
1-2        D + F
2-4        C + E (quiet) + G
4-5        G (quiet)
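The schedule above can be written down as data and mixed into one signal. This is a rough sketch only: decaying sines stand in for the piano recordings, and the octave, the sample rate and the gain of 0.3 for "quiet" are all assumptions, since the mail does not give them.

```python
import math

SAMPLE_RATE = 8000  # assumed; kept low so the sketch runs quickly

# Assumed pitches (4th octave); the octave actually used is not stated.
NOTE_HZ = {"C": 261.63, "D": 293.66, "E": 329.63, "F": 349.23, "G": 392.00}

# (start [s], end [s], note, gain); "quiet" is modelled as gain 0.3 (assumption).
SCHEDULE = [
    (0, 1, "C", 1.0),
    (1, 2, "D", 1.0), (1, 2, "F", 1.0),
    (2, 4, "C", 1.0), (2, 4, "E", 0.3), (2, 4, "G", 1.0),
    (4, 5, "G", 0.3),
]

def fake_note(freq, seconds):
    """Stand-in for a piano recording: an exponentially decaying sine."""
    n = int(seconds * SAMPLE_RATE)
    return [math.exp(-3.0 * i / SAMPLE_RATE)
            * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(n)]

def compose(schedule, total_seconds):
    """Mix every scheduled note into one signal by sample-wise addition."""
    out = [0.0] * int(total_seconds * SAMPLE_RATE)
    for start, end, note, gain in schedule:
        offset = int(start * SAMPLE_RATE)
        for i, v in enumerate(fake_note(NOTE_HZ[note], end - start)):
            out[offset + i] += gain * v
    return out

test_signal = compose(SCHEDULE, 5)
print(len(test_signal) / SAMPLE_RATE, "seconds")
```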
Then I made atoms out of the piano note recordings taking small parts of
the onset and small parts of the signal half a second after the onset.
Thus I got two atoms for each note - one onset atom and one sustain
atom (from the later part of the note). I then took a Gaussian envelope
and applied it to all the 14 (7*2)
atoms. And then I used these atoms to decompose the signal... getting
pretty good results - all the notes in the signal were detected, only
there were false detections in neighbouring notes (I think this could
be remedied by some simple post-processing of the results).
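The procedure described above (cut two pieces out of each note recording, apply a Gaussian envelope, decompose, then clean up weak detections) could look roughly like the sketch below. Everything here is an assumption for illustration: decaying sines replace the piano recordings, only three notes are used, and the names, atom length, hop size and the 20% energy threshold are invented for this example. The clean-up step is just one possible form of the post-processing mentioned above.

```python
import math

SR = 4000          # assumed sample rate, kept low so pure Python stays fast
ATOM_LEN = 256     # assumed atom length in samples (64 ms)
HOP = 100          # assumed spacing between candidate atom positions

def fake_recording(freq, seconds=1.0):
    """Stand-in for a recorded piano note: a decaying sine."""
    return [math.exp(-3.0 * i / SR) * math.sin(2 * math.pi * freq * i / SR)
            for i in range(int(seconds * SR))]

def make_atom(recording, start):
    """Cut ATOM_LEN samples at `start`, apply a Gaussian envelope, normalise."""
    mid, sigma = (ATOM_LEN - 1) / 2, ATOM_LEN / 6
    seg = [recording[start + i] * math.exp(-0.5 * ((i - mid) / sigma) ** 2)
           for i in range(ATOM_LEN)]
    norm = math.sqrt(sum(v * v for v in seg))
    return [v / norm for v in seg]

def decompose(signal, atoms, n_iter):
    """Greedy matching pursuit over every (atom, position) pair."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        best = None
        for key, atom in atoms.items():
            for pos in range(0, len(residual) - ATOM_LEN + 1, HOP):
                c = sum(a * r for a, r in zip(atom, residual[pos:pos + ATOM_LEN]))
                if best is None or abs(c) > abs(best[2]):
                    best = (key, pos, c)
        (note, kind), pos, c = best
        for i, a in enumerate(atoms[(note, kind)]):
            residual[pos + i] -= c * a
        picks.append((note, kind, pos, c))
    return picks

def detected_notes(picks, keep_ratio=0.2):
    """Post-processing: drop notes whose summed energy is far below the strongest."""
    energy = {}
    for note, _kind, _pos, c in picks:
        energy[note] = energy.get(note, 0.0) + c * c
    top = max(energy.values())
    return sorted(n for n, e in energy.items() if e >= keep_ratio * top)

recordings = {n: fake_recording(f) for n, f in
              (("C", 261.63), ("D", 293.66), ("E", 329.63))}
atoms = {}
for note, rec in recordings.items():
    atoms[(note, "onset")] = make_atom(rec, 0)
    atoms[(note, "later")] = make_atom(rec, SR // 2)  # half a second after onset

signal = recordings["C"] + recordings["E"]   # C for 1 s, then E for 1 s
picks = decompose(signal, atoms, n_iter=6)
print(detected_notes(picks))
```

The triple loop in decompose is the slow part; every iteration recomputes all inner products, which is the kind of cost an optimized implementation would avoid.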
One huge downside though - these 5 seconds took about 1.5 hours to
decompose into 300 atoms (well, it could've probably stopped at
about 200 and the results would be the same)... but I'm hoping optimized
C/C++ code (as opposed to unoptimized Matlab scripts), perhaps using the
MP ToolKit (MPTK), should do this many times faster.
One might also say that this is unrealistic - I took the same recordings
to make both the atoms and the test signal... so no wonder it worked...
well, yes - but I can think of at least one application of specifically
this - sampling your own non-MIDI instrument (or voice?) and then
turning whatever you play on it through a microphone (monophonic,
polyphonic, vibrato, take what you want) into MIDI signals. Should be
nice :)
Ad 2. Well this was a failure :( Admittedly I did not spend as much time
on this - I got too excited with the results from the first try (forgot
to eat my dinner today... this is going to be an exciting summer ;) ).
For this experiment I took 20s out of the example CLAM Annotator song -
Debaser-CoffeeSmell.mp3 - and tried decomposing it using mono-frequency
Gabor atoms and harmonic Gabor atoms (with frequencies f0, 2*f0,
3*f0)... and mostly what I see in the results are notes right next to
the notes that should be there (the right ones absent) - like in a
short-time Fourier transform with too short a window - one sees the
energy near one's notes, without being able to pinpoint the right
ones... Well, perhaps this is a bit inconclusive, since I didn't have
enough time to let the scripts work longer on the signal; perhaps
better results will emerge overnight (all excited about tomorrow
morning - I'll be dreaming in Gabors tonight ;) ). But for now - this
test is a failure :(
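To illustrate the "too short a window" effect, here is a small sketch that builds harmonic Gabor atoms (f0, 2*f0, 3*f0 under one Gaussian envelope) for two neighbouring semitones and measures how much they overlap. The function names, the sample rate, the two widths and the choice of C4/C#4 are assumptions for this example, not the settings of the actual test.

```python
import math

def harmonic_gabor(n, f0, width, sample_rate, n_harmonics=3):
    """Unit-norm sum of cosines at f0, 2*f0, ... under one Gaussian envelope."""
    center = n / 2
    atom = []
    for i in range(n):
        t = (i - center) / sample_rate
        env = math.exp(-0.5 * ((i - center) / width) ** 2)
        atom.append(env * sum(math.cos(2 * math.pi * k * f0 * t)
                              for k in range(1, n_harmonics + 1)))
    norm = math.sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]

def overlap(a, b):
    """Inner product of two unit-norm atoms (1.0 means indistinguishable)."""
    return sum(x * y for x, y in zip(a, b))

SR, N = 8000, 4096
C4, C_SHARP4 = 261.63, 277.18   # neighbouring semitones

for width in (40, 400):   # Gaussian width in samples (5 ms and 50 ms)
    o = overlap(harmonic_gabor(N, C4, width, SR),
                harmonic_gabor(N, C_SHARP4, width, SR))
    print(f"width={width:4d} samples  overlap of C4 and C#4 atoms = {o:.3f}")
```

With the short envelope the two atoms are nearly interchangeable, so a greedy search can easily land on the neighbour of the right note; the long envelope separates them at the price of worse time resolution.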
IV) Now a decision has to be made - whether I should continue with MP or
try different techniques for improving chord extraction... any advice?
My arguments for MP:
* it'd be useful not only for musical analysis, you could define any
atoms and look for them in the signal
* existing GPL implementation - MPTK, should let me focus on the details
that should turn the failed trial into a successful one, instead of
delving into implementation details
* well, "it's cool" ;)
...my arguments against MP:
* I've spent so much time these couple of weeks reading papers 'bout MP
that perhaps it would be healthier to try something new during the
summer ;)
V) I also have a question - I would've put all that on the wiki pages,
but I am unable to create an account - I'm supposed to identify some
images... only none are displayed and I can't get them by copying their
addresses either... Is it me? My browsers (Firefox (Iceweasel as of
late...), Mozilla (Iceape...) Navigator, Konqueror)? Or something else?
Sorry for taking so much of your time, hope I managed to explain what
I've been doing and you don't consider that a waste of precious coding
time :) And I'm open to suggestions/decisions on whether I should
continue doing this through the summer or try something different.
Three minutes after midnight where I live - so it's 28 May, let the SoC
begin then :)
Roman