[Clam-devel] [patch] patch for finding polynomial roots: the route to formants
David García Garzón
dgarcia at iua.upf.edu
Mon Jul 9 08:13:35 PDT 2007
On Thursday 05 July 2007 19:18:33 abe kazemzadeh wrote:
> Hi All,
>
> Here's a patch of what I've been working on. I added a method to LPModel
> to get the roots (poles) of the LPC coefficients. Since it is a general
> method,
I'm not sure whether this is already in Sandra's code. Did you check?
> not specific to LPC, I made a new class, Polynomial, in CLAM/src/Standard.
> I'm not sure if this is the best location to put it.
Standard? Yes, that is the right place.
> Also, in this class there are methods that might be better in separate
> classes (eg, finding the eigenvalues of a matrix: there seem to be some
> stubs for this purpose in CLAM/src/Standard/MatrixTmplDec.hxx:217), so
> let me know if anyone has ideas about this.
I think the method is fine where you put it. I don't like the Matrix object
at all, though. I would like to reduce the remaining uses of Matrix so that
we can drop it. We could eventually adopt a good matrix library as a
dependency. Would you mind using a plain std::vector instead of a Matrix?
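As a rough sketch of what plain std::vector storage could look like, the
companion matrix mentioned in the overview quoted below can be kept as a flat
row-major vector. The function name and the coefficient ordering (highest
power first) are assumptions for illustration, not the interface of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CLAM or of the patch.
// Builds the n x n companion matrix of
//   c[0]*z^n + c[1]*z^(n-1) + ... + c[n]
// as a flat row-major vector: element (row, column) is at [row * n + column].
// The eigenvalues of this matrix are the roots of the polynomial.
std::vector<double> BuildCompanionRowMajor(const std::vector<double> & coefficients)
{
    assert(coefficients.size() >= 2);   // at least degree 1
    assert(coefficients[0] != 0.0);     // leading coefficient must be non-zero
    const std::size_t n = coefficients.size() - 1;
    std::vector<double> companion(n * n, 0.0);
    // First row: negated coefficients of the normalized (monic) polynomial.
    for (std::size_t column = 0; column < n; ++column)
        companion[column] = -coefficients[column + 1] / coefficients[0];
    // Ones on the subdiagonal.
    for (std::size_t row = 1; row < n; ++row)
        companion[row * n + (row - 1)] = 1.0;
    return companion;
}
```

For z^2 - 3z + 2 this gives the rows (3, -2) and (1, 0), whose eigenvalues are
the roots 1 and 2.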
> Overview:
> I added a method to LPModel, LPModel::ToRoots(), which just calls the root
> solving function of the new class, Polynomial::PolyRoots() on the lpc
> coefficients.
> PolyRoots takes the lpc coefficients, creates a companion matrix
> (Polynomial::BuildCompanion()), balances it, and then gets the
> eigenvalues of this matrix (Polynomial::EigenHessenberg()). This output
> is the roots, which are the formants of speech. However, I haven't
> converted them from their complex representation to the frequency yet.
> Also, for actual formant tracking, I still need to add ways to smooth
> the output. It's still a bit rough, but I wanted to get feedback before
> further work.
I have no knowledge of the subject, but it seems OK to me.
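For the conversion step that is still missing, a minimal sketch could look
like the following. It uses the usual relations for a pole z at sampling rate
fs: frequency = arg(z) * fs / (2*pi) and bandwidth = -ln|z| * fs / pi. The
struct and function names are hypothetical, and keeping only roots with a
positive imaginary part that lie inside the unit circle is an assumption
about what counts as a formant candidate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical types and names, not part of CLAM or of the patch.
struct FormantCandidate
{
    double frequencyHz;
    double bandwidthHz;
};

bool LowerFrequency(const FormantCandidate & a, const FormantCandidate & b)
{
    return a.frequencyHz < b.frequencyHz;
}

// Converts complex LPC roots (poles) to frequency/bandwidth pairs,
// sorted by increasing frequency.
std::vector<FormantCandidate> RootsToFormantCandidates(
    const std::vector<std::complex<double> > & roots, double sampleRate)
{
    assert(sampleRate > 0.0);
    const double pi = 3.14159265358979323846;
    std::vector<FormantCandidate> candidates;
    for (std::size_t i = 0; i < roots.size(); ++i)
    {
        // Keep one root of each conjugate pair and skip purely real roots.
        if (roots[i].imag() <= 0.0) continue;
        const double radius = std::abs(roots[i]);
        // Assumption: roots on or outside the unit circle are discarded.
        if (radius >= 1.0) continue;
        FormantCandidate candidate;
        candidate.frequencyHz = std::arg(roots[i]) * sampleRate / (2.0 * pi);
        candidate.bandwidthHz = -std::log(radius) * sampleRate / pi;
        candidates.push_back(candidate);
    }
    std::sort(candidates.begin(), candidates.end(), LowerFrequency);
    return candidates;
}
```

The result is still only a list of candidates per frame. Picking the actual
formants and smoothing them over time would come on top of this.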
> Some issues:
>
> -right now the default lpc order is 11. This seems like the textbook
> value that people quote, but I think that it applies to speech coding
> in 8kHz, so for higher sampling rates it might be better to have a
> higher order. I dug out the notes from the speech processing class that
> I took and there is a nice derivation of how to pick the order (based on
> the sampling rate, the length of the vocal tract, and the speed of
> sound). I'm not sure how feasible it would be to get the LPModel class
> to configure itself based on the sampling rate, but this would make it
> convenient for the user. Either that or downsampling to 8kHz before
> calculating the LPC/formants (it seems that that might be what Sandra
> Gilabert did).
If you know how to estimate the order from the sampling rate, I think that is
the nicer way. Resampling can also be convenient, though. Sandra's code
included a resampler, but we were thinking about adding a resampling
processing module to CLAM using the libsamplerate [1] library.
[1] http://www.mega-nerd.com/SRC/index.html
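One common rule of thumb, which may or may not match the derivation in the
notes mentioned above, models the vocal tract as a uniform tube of length L.
Its resonances are spaced c / (2L) apart, so the band up to fs / 2 holds about
fs * L / c formants, each needing a conjugate pole pair, plus a few extra
poles for the source and radiation. The function name, the defaults (L = 0.17
m, c = 340 m/s) and the two extra poles are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of CLAM or of the patch.
// Estimates an LPC order as 2 * fs * L / c (one pole pair per expected
// formant below fs / 2), rounded to the nearest integer, plus extraPoles.
int EstimateLpcOrder(double sampleRate,
                     double tractLengthMeters = 0.17,
                     double soundSpeedMetersPerSecond = 340.0,
                     int extraPoles = 2)
{
    assert(sampleRate > 0.0);
    assert(tractLengthMeters > 0.0);
    assert(soundSpeedMetersPerSecond > 0.0);
    const double formantPoles =
        2.0 * sampleRate * tractLengthMeters / soundSpeedMetersPerSecond;
    return static_cast<int>(std::floor(formantPoles + 0.5)) + extraPoles;
}
```

With these defaults, 8000 Hz gives an order of 10 and 44100 Hz gives 46, so
the value depends strongly on the sampling rate. The number of extra poles is
a tuning choice.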
> -I implemented one of the algorithms that I found online that seemed
> good. I considered LAPACK++, but didn't use it b/c it seemed like a lot
> to learn when I was already learning clam. In retrospect, it took me a
> fair amount of time to translate, debug, and test the algorithm I used,
> which makes LAPACK look good, so I wanted to see if anyone on the list
> is familiar with LAPACK.
Not me. But isn't there a FOSS library that does that?
> -one of the constants in the algorithm is epsilon, a tiny value such
> that anything less is negligible. I was trying to see if CLAM has such a
> constant and I found something in
> CLAM/test/UnitTests/cppUnitHelper.hxx:174, but I couldn't figure out
> what it was or how to use it.
Epsilon in the tests is the tolerance used to accept two double values as
equal. Does it cause you any conflict?
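If the algorithm only needs the machine precision, the standard library
already provides it, so no CLAM constant is required. A minimal sketch,
assuming the algorithm's epsilon is the machine epsilon scaled by the
magnitude of neighbouring matrix elements (the function name is
hypothetical):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper, not part of CLAM or of the patch.
// Returns true when value is negligible relative to scale, using the
// machine epsilon for double from the standard library.
bool IsNegligible(double value, double scale)
{
    assert(scale >= 0.0);
    const double epsilon = std::numeric_limits<double>::epsilon();
    return std::fabs(value) <= epsilon * scale;
}
```

This is a different thing from the test epsilon: the test one is a
comparison tolerance chosen by the test writer, while this one is a property
of the floating point type.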
> -The patch has some extraneous details, but I didn't want to edit the
> file b/c I'm not sure if that would screw up the patch format (eg,
> taking out what seems to be a binary file and changes unrelated to the
> work I'm submitting). Also, if anyone has any knowledge about how to
> make emacs give the same indentation format as the clam style, that
> would be good.
To clean up the patch you can specify the list of files or directories you
want to be considered for the patch.
> -I tested out the algorithm manually, but I was wondering about
> automatic tests: would that be good or necessary, and if so, could
> someone point me in the right direction for doing this.
An algorithm can be covered by back-to-back tests or unit tests:
- If some (simple) input has a known output, you can generate the data by hand
and test against it as in a back-to-back test.
- If you have a reference implementation, use it to generate the I/O data for
the back-to-back test.
- If you hand-checked the results on real data, set up a back-to-back test so
that you are warned when the results change. You should provide some criteria
to validate further changes.
- That validation can be automated when your algorithm has some fitness
criterion, for example, if you are testing a segment extractor and you have a
hand-annotated wave and a fitness function.
I have to rewrite this as a wiki page.
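For the root finder, the "simple input with known output" case could be
checked with a residual: evaluate the polynomial at each returned root and
require the result to be near zero. The helpers below are a sketch with
hypothetical names. They do not call the patch's PolyRoots, whose signature
is not shown in this thread, so the roots are passed in as plain vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> Complex;

// Hypothetical helpers, not part of CLAM or of the patch.
// Evaluates c[0]*z^n + c[1]*z^(n-1) + ... + c[n] with Horner's rule.
Complex EvaluatePolynomial(const std::vector<double> & coefficients,
                           const Complex & z)
{
    Complex result(0.0, 0.0);
    for (std::size_t i = 0; i < coefficients.size(); ++i)
        result = result * z + coefficients[i];
    return result;
}

// Largest |p(root)| over all candidate roots; near zero for a good solver.
double MaxResidual(const std::vector<double> & coefficients,
                   const std::vector<Complex> & roots)
{
    double worst = 0.0;
    for (std::size_t i = 0; i < roots.size(); ++i)
        worst = std::max(worst, std::abs(EvaluatePolynomial(coefficients, roots[i])));
    return worst;
}
```

A unit test would feed a polynomial with known roots, such as z^2 - 3z + 2
with roots 1 and 2, to the solver and assert that MaxResidual stays below a
chosen tolerance, which is where the test epsilon comes in.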