parumi at iua.upf.edu
Wed Jun 13 06:11:11 PDT 2007
abe wrote:
> Hi Liviu,
> If you need to calculate acoustic similarity, you can use dynamic time
> warping (DTW, http://en.wikipedia.org/wiki/Dynamic_time_warping ), which
> basically windows the speech from the target and test files, extracts
> features for each frame, and aligns them in the best possible way. The
> algorithm defines a cost for inserting and deleting frames and for the
> mismatch between features, so the overall cost (or the cost normalized
> by the length of the file) provides a good measure of difference.
> There are other refinements to the algorithm.
> CLAM can do the feature extraction, but I'm not sure whether it has a DTW
> algorithm (I'm still learning CLAM, so I'm not an expert), so
> that might be something you'll have to do yourself. I've used this
> methodology for comparing accents (native speaker vs. non-native speaker
> reading the same sentences). Back then I used HTK for the feature
> extraction and a Perl script for the DTW.
> Hope this helps and maybe stimulates more ideas for doing it with CLAM...
Abe, I found your explanation very interesting.
As you guessed, CLAM does not have a DTW implementation, though it would
fit the framework quite well.
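
For anyone who wants to experiment in the meantime, here is a minimal
sketch of the alignment abe describes, written in Python rather than
CLAM's C++. It assumes each file has already been turned into a list of
per-frame feature vectors (by CLAM, HTK or anything else). The names
dtw_cost, frame_distance and gap_penalty are made up for this example,
and the Euclidean frame distance and the normalization by the summed
lengths are assumptions, not abe's Perl script or anything that exists
in CLAM.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length.

    Assumption: any other per-frame dissimilarity could be used instead.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(target, test, gap_penalty=0.0, normalize=True):
    """Align two sequences of feature vectors and return the alignment cost.

    gap_penalty is a hypothetical extra charge for every insertion or
    deletion step; 0.0 gives the plain textbook recurrence.
    """
    n, m = len(target), len(test)
    if n == 0 or m == 0:
        raise ValueError("both sequences need at least one frame")

    inf = float("inf")
    # cost[i][j] = cheapest alignment of target[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(target[i - 1], test[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],            # advance in both files
                cost[i - 1][j] + gap_penalty,  # extra target frame
                cost[i][j - 1] + gap_penalty,  # extra test frame
            )

    total = cost[n][m]
    # Assumption: divide by n + m so files of different length compare fairly.
    return total / (n + m) if normalize else total
```

A lower value means the two files are more alike. Two recordings of the
same contour at different speeds, such as [[0.0], [1.0], [2.0], [1.0]]
and [[0.0], [1.0], [1.0], [2.0], [1.0]], should cost less than either
against an unrelated sequence. Where to put the "same / different"
threshold has to be tuned on real data.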
> David García Garzón wrote:
>> Hi, Liviu.
>> Do you mean speech recognition? Speaker recognition? Just sound
>> classification? What is the concrete use case? I am not sure what
>> you mean, but your statement seems too general to be achievable.
>> Depending on your purpose you might be at the bleeding edge of research,
>> especially if you go to the semantic level.
>> CLAM currently has no sound classification system but it has many of
>> the building blocks such systems use. That's better than starting from
>> scratch.
>> A new document we added to the wiki gives you an overview of the
>> steps for getting started with CLAM:
>>  http://clam.iua.upf.edu/wikis/clam/index.php/Approaching_CLAM
>> If you need more help, just ask.
>> On Tuesday 12 June 2007 16:44:20 Liviu Macoviciuc wrote:
>>> I am a newbie to CLAM and I don't understand much.
>>> However, I need to write a program that says if 2 audio files are
>>> For example, a file might contain a voice saying "I am John", and
>>> file the same voice or another voice saying "I am Bill"
>>> Can anybody help me get started?
>>> Best regards,
>> CLAM mailing list
>> CLAM at iua.upf.es