[Clam-devel] residual spectrum line segment approximation?

Xavier Amatriain xavier at amatriain.net
Tue Jul 15 14:55:31 PDT 2008


Hi Roumbaba, and congrats on your progress!

You are right about the source of your problem: SMSSynthesis expects your 
residual spectrum to already carry the analysis window, and if it does not, 
things are likely to go wrong.

The lines responsible for that are around SMSSynthesis.cxx:252:

http://clam.iua.upf.edu/doc/CLAM-doxygen/SMSSynthesis_8cxx-source.html#l00252

First the peaks are synthesized into a sinusoidal spectrum. Then the 
sinusoidal and residual spectra are added. At that point both spectra are 
already assumed to have the same analysis window (BH92) and size. The effect 
of that window is undone in line 261, when the global spectral synthesis is 
performed.
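
To make the mismatch concrete, here is a minimal sketch in Python (not CLAM 
code). It assumes the standard 4-term Blackman-Harris coefficients for BH92 
and models the global spectral synthesis as nothing more than an inverse DFT 
followed by division by the window. The function names, the frame size and 
the test signals are all made up for illustration. When both inputs were 
windowed the frame comes back intact, while an unwindowed residual gets 
strongly amplified near the frame boundaries:

```python
import cmath
import math

# Standard 4-term Blackman-Harris coefficients (assumed to match CLAM's BH92).
BH92_COEFFS = (0.35875, 0.48829, 0.14128, 0.01168)

def bh92(n_points):
    """Symmetric 4-term Blackman-Harris window of length n_points."""
    a0, a1, a2, a3 = BH92_COEFFS
    step = 2.0 * math.pi / (n_points - 1)
    return [a0 - a1 * math.cos(step * n) + a2 * math.cos(2 * step * n)
            - a3 * math.cos(3 * step * n) for n in range(n_points)]

def dft(x):
    """Naive O(N^2) forward DFT, fine for a small illustrative frame."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
            for k in range(size)]

def idft(spec):
    """Naive O(N^2) inverse DFT."""
    size = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * n / size) for k in range(size)) / size
            for n in range(size)]

def toy_synthesis(sin_spec, res_spec, window):
    """Add the two spectra, go back to time, then undo the analysis window."""
    total = [s + r for s, r in zip(sin_spec, res_spec)]
    return [v.real / w for v, w in zip(idft(total), window)]

SIZE = 32  # hypothetical frame size, kept small for the naive DFT
window = bh92(SIZE)
sinusoid = [math.cos(2.0 * math.pi * 3 * n / SIZE) for n in range(SIZE)]
residual = [0.1 * math.cos(1.7 * n * n) for n in range(SIZE)]  # stand-in for noise
expected = [s + r for s, r in zip(sinusoid, residual)]

sin_spec = dft([s * w for s, w in zip(sinusoid, window)])
# Case 1: the residual spectrum comes from a BH92-windowed frame.
good = toy_synthesis(sin_spec, dft([r * w for r, w in zip(residual, window)]), window)
# Case 2: the residual spectrum was built without any window.
bad = toy_synthesis(sin_spec, dft(residual), window)

good_err = max(abs(g - e) for g, e in zip(good, expected))
bad_err = max(abs(b - e) for b, e in zip(bad, expected))
print("max error, residual windowed:    ", good_err)
print("max error, residual not windowed:", bad_err)
```

In this toy model the error is concentrated at the frame edges, where the 
inverse window is largest, which would be consistent with the overlap-add 
discontinuities you describe.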

The issue here is that you need to guarantee that both spectra are built 
the same way before adding them. The sinusoidal peaks are reconstructed by 
convolving with the transform of the main lobe of the window (BH92), but 
you are reconstructing the residual in a different way. So you either 
apply the BH92 transform to your residual spectrum, or avoid doing that in 
the peak synthesis (and then also avoid multiplying by the inverse window 
in the global spectral synthesis). Neither option is trivial, but I'd say 
the first one should be easier to work out.
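
As a rough idea of what the first option could look like, here is a sketch 
in Python (again not CLAM code, and untested against SMSConsole). It takes 
magnitudes for bins 0..N/2, assigns random phases with Hermitian symmetry so 
that the frame is real, multiplies the time-domain frame by BH92 and 
transforms back. Multiplying by the window in time is equivalent to 
circularly convolving the spectrum with the window's transform. The helper 
names, the flat magnitude and the frame size are hypothetical, and the BH92 
coefficients are the standard 4-term Blackman-Harris ones, assumed to match 
CLAM's:

```python
import cmath
import math
import random

def bh92(n_points):
    """Symmetric 4-term Blackman-Harris window (standard coefficients)."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    step = 2.0 * math.pi / (n_points - 1)
    return [a0 - a1 * math.cos(step * n) + a2 * math.cos(2 * step * n)
            - a3 * math.cos(3 * step * n) for n in range(n_points)]

def dft(x):
    """Naive O(N^2) forward DFT, fine for a small illustrative frame."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
            for k in range(size)]

def idft(spec):
    """Naive O(N^2) inverse DFT."""
    size = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * n / size) for k in range(size)) / size
            for n in range(size)]

def random_phase_spectrum(half_mags, rng):
    """Full Hermitian spectrum from N/2+1 magnitudes and random phases."""
    half = len(half_mags) - 1
    size = 2 * half
    spec = [0j] * size
    spec[0] = complex(half_mags[0], 0.0)        # DC bin must be real
    spec[half] = complex(half_mags[half], 0.0)  # Nyquist bin must be real
    for k in range(1, half):
        spec[k] = cmath.rect(half_mags[k], rng.uniform(-math.pi, math.pi))
        spec[size - k] = spec[k].conjugate()    # mirror bin keeps the frame real
    return spec

def window_residual(spec):
    """Return the raw frame, the BH92-windowed frame and its spectrum."""
    frame = [v.real for v in idft(spec)]
    windowed = [v * w for v, w in zip(frame, bh92(len(spec)))]
    return frame, windowed, dft(windowed)

rng = random.Random(0)
half_mags = [1.0] * 17  # hypothetical flat envelope for a 32-point frame
raw_spec = random_phase_spectrum(half_mags, rng)
frame, windowed, windowed_spec = window_residual(raw_spec)
print("frame edge before windowing:", frame[0])
print("frame edge after windowing: ", windowed[0])
```

The windowed frame now tapers to almost zero at both ends, so dividing by 
the same window later only undoes what was applied here. Note that the 
magnitudes of windowed_spec are no longer exactly the input envelope, since 
the window's main lobe smears neighbouring bins.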

Hope it helps... and if you get it to work don't forget to report back.

roumbaba wrote:
>
>
> Hello all and thanks again for your previous help,
>
> So I have written a Matlab script to perform line-segment 
> approximation of the noise spectrum.
>
> - As input the script takes an sdif file generated by analysis with 
> SMSConsole.
> - It then reads all sdif frames, in particular the 1STF frames 
> containing the noise spectra in complex form.
> - It converts these complex spectra into magPhase form.
> - It performs line-segment approximation on the amplitudes.
>
> To check the impact of the approximation on the quality of the 
> resynthesis, the script does the following:
> - It reconstructs full noise magnitude spectra from the line 
> approximations (by linear interpolation).
> - It randomizes the phases.
> - It converts the new "smoothed" magPhase spectra back to complex 
> spectra.
> - It writes back the sdif file with these new "smoothed" spectra 
> instead of the original raw noise spectra.
>
> Then I run SMSConsole to synthesize that sdif file with the exact same 
> parameters as for the original sdif file.
> My problem is that the resulting synthesized noise sounds like 
> something is wrong in the synthesis overlap-add (like lots of 
> discontinuities in the resynthesis).
> I think that this might be due to what is described in the 
> Serra/Smith 1990 CMJ paper concerning line-segment approximation noise 
> resynthesis:
>
> " ...Since the [new] phase spectrum used is not the result of an 
> analysis process (with windowing of a waveform, zero padding, and FFT 
> computation), the resulting signal does not taper to 0 at the 
> boundaries. This is because a phase spectrum with random values 
> corresponds to a phase spectrum of a rectangular-windowed noise 
> waveform of size N. In order to succeed in the overlap-add resynthesis 
> (i.e., to obtain smooth transitions between frames) we need a smoothly 
> windowed waveform of size M, where M is the synthesis-window length. ....
> "
>
> So what might be happening is that by default SMSConsole assumes that 
> the 1STF frames are *NOT* line-segment approximations and therefore 
> does *NOT* perform that last windowing at synthesis time. I have gone 
> through the SMS/Clam code a little, but I cannot find where I can 
> change this behavior, or even whether that is the default behavior. 
> Where should I look in the SMS/Clam code?
>
>
> Thanks,
>
> Roumbaba
>
>
>
> On 27 May 08, at 23:25, Xavier Amatriain wrote:
>
>> Hi Roumbaba,
>>
>> In the paper you cite it says "you can", which does not mean "you 
>> have to" :-) Doing an approximation of the residual model is indeed
>> an interesting thing to do, especially if you want to reduce the 
>> amount of data in your transformed signal, however it is not a must.
>> Note that there are many other ways to model the residual apart from 
>> the one mentioned in that paper.
>>
>> So far, in CLAM we are using the residual as is, with no modeling or 
>> approximation. The "only" downside is that the transformed
>> signal (SMS Data) is in fact larger than the original audio when it 
>> could be much smaller with not much loss in quality. If for
>> whatever reason you do need to do the residual modeling, you can look 
>> at the SpectralEnvelopeExtract processing. This processing
>> generates a spectral approximation (spectrum in bpf format), but from 
>> an array of peaks; it would not be hard to modify it to work
>> with an input spectrum.
>>
>> X
>>
>>
>> roumbaba wrote:
>>> Hi all,
>>>
>>> I am trying to understand how the residual spectrum gets modeled in 
>>> clam/SMS. I have read the Serra/Smith 1990 CMJ paper and, as I 
>>> understand it, it describes two steps:
>>> 1- subtract the harmonic spectrum from the original spectrum
>>> 2- perform a line-segment approximation of the residual spectrum 
>>> obtained in 1
>>>
>>> I have stepped through clam and SMS code and I think I can see where 
>>> step 1 gets performed:
>>>
>>> SMSAnalysisCore::Do()
>>> {
>>>
>>> mSinSpectralAnalysis.Do();
>>> mResSpectralAnalysis.Do();
>>> ...
>>> ...
>>> ...
>>> mSynthSineSpectrum.Do();
>>> mSpecSubstracter.Do(); /* step 1 gets performed here I think*/
>>>
>>> }
>>>
>>>
>>> but I cannot find where step 2 (line approximation) gets performed. 
>>> Where should I look in the code?
>>>
>>> Thank you very much,
>>> Cheers,
>>>
>>> Roumbaba
>>>
>>> ps:
>>>
>>> Here is a quote from the paper I mentioned above:
>>>
>>> "Approximation of the Spectral Residual
>>>
>>> Assuming the residual signal is quasi-stochastic, each 
>>> magnitude-spectrum residual can be approximated by its envelope 
>>> since only its shape contributes to the sound characteristics. [...] 
>>> The particular line-segment approximation performed here is done by 
>>> stepping through the magnitude spectrum and finding local maxima in 
>>> every section, ..."
>>>
>>>
>>> _______________________________________________
>>> Clam-devel mailing list
>>> Clam-devel at llistes.projectes.lafarga.org
>>> https://llistes.projectes.lafarga.org/cgi-bin/mailman/listinfo/clam-devel 
>>>
