[Clam-devel] residual spectrum line segment approximation?
roumbaba at gmail.com
Mon Jul 14 16:27:12 PDT 2008
Hello all and thanks again for your previous help,
So I have written a MATLAB script to perform noise spectrum line
segment approximation:
- As input the script takes an sdif file generated by analysis
- It then reads all sdif frames, in particular the 1STF frames
containing the noise spectrums in complex form.
- It converts these complex spectrums into magPhase form
- It performs line segment approximation on the amplitudes.
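
The steps in the list above could be sketched roughly as follows, in Python rather than MATLAB and using only the standard library. The names to_mag_phase, line_segment_approx and section_size are illustrative assumptions, not part of the original script or of CLAM. Taking the maximum of each fixed-size section is only one possible way to pick breakpoints; it follows the description quoted from the Serra/Smith paper later in this thread.

```python
import cmath

def to_mag_phase(spectrum):
    """Split complex bins into parallel magnitude and phase lists."""
    mags = [abs(c) for c in spectrum]
    phases = [cmath.phase(c) for c in spectrum]
    return mags, phases

def line_segment_approx(mags, section_size):
    """Return (bin_index, magnitude) breakpoints, one per section.

    Each breakpoint is the largest magnitude found in its section
    (hypothetical choice; section_size is a free parameter).
    """
    points = []
    for start in range(0, len(mags), section_size):
        section = mags[start:start + section_size]
        peak = max(range(len(section)), key=section.__getitem__)
        points.append((start + peak, section[peak]))
    return points

# Toy six-bin "noise spectrum" in complex form.
spectrum = [1 + 0j, 0 + 2j, -3 + 0j, 0 - 1j, 0.5 + 0j, 0 + 4j]
mags, phases = to_mag_phase(spectrum)
breakpoints = line_segment_approx(mags, section_size=3)
print(breakpoints)
```

Only the breakpoints need to be stored per frame, which is where the data reduction comes from.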
To check the impact of the approximation on the quality of
resynthesis the script does the following:
- It reconstructs full noise magnitude spectrums from the line
approximations (by linear interpolation)
- It randomizes the phases
- It converts the new "smoothed" magPhase spectrums back to complex
- It writes back the sdif file with these new "smoothed" spectrums
instead of the original raw noise spectrums.
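
A minimal sketch of the reconstruction steps listed above, again in standard-library Python. The helper names interpolate_envelope and randomize_phases are assumptions made for illustration. Bins outside the first and last breakpoint simply hold the nearest breakpoint value, and the special handling a real signal needs for the DC and Nyquist bins (they must stay real) is left out.

```python
import cmath
import math
import random

def interpolate_envelope(breakpoints, num_bins):
    """Linearly interpolate (bin, magnitude) breakpoints onto every bin.

    Needs at least two breakpoints, sorted by bin index.
    """
    mags = []
    seg = 0
    for k in range(num_bins):
        # Advance to the segment that contains bin k.
        while seg < len(breakpoints) - 2 and k > breakpoints[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = breakpoints[seg], breakpoints[seg + 1]
        if k <= x0:
            mags.append(y0)
        elif k >= x1:
            mags.append(y1)
        else:
            t = (k - x0) / (x1 - x0)
            mags.append(y0 + t * (y1 - y0))
    return mags

def randomize_phases(mags, rng):
    """Pair each magnitude with a uniform random phase in [-pi, pi]."""
    return [cmath.rect(m, rng.uniform(-math.pi, math.pi)) for m in mags]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
envelope = interpolate_envelope([(0, 1.0), (4, 3.0), (6, 1.0)], 8)
smoothed = randomize_phases(envelope, rng)
print(envelope)
```

The magnitudes of the "smoothed" bins match the interpolated envelope; only the phases differ from frame to frame.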
Then I run SMSConsole to synthesize that sdif file with the exact
same parameters as for the original sdif file.
My problem is that the resulting synthesised noise sounds like
something is wrong in the synthesis overlap-add (like lots of
discontinuities in the resynthesis).
I think that this might be due to what is described in the Serra/Smith
1990 CMJ paper concerning line segment approximation of the noise:
"... Since the [new] phase spectrum used is not the result of an
analysis process (with windowing of a waveform, zero padding, and FFT
computation), the resulting signal does not taper to 0 at the
boundaries. This is because a phase spectrum with random values
corresponds to a phase spectrum of a rectangular-windowed noise
waveform of size N. In order to succeed in the overlap-add
resynthesis (i.e., to obtain smooth transitions between frames) we need
a smoothly windowed waveform of size M, where M is the synthesis-window
length. ..."
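
The windowing step the quote describes could look something like the sketch below. It is an assumption-laden illustration, not SMSConsole or CLAM code: the names random_phase_frame, hann and window_frame are made up here, a symmetric Hann window stands in for whatever synthesis window is actually used, and a naive inverse DFT keeps the example free of external libraries.

```python
import cmath
import math
import random

def random_phase_frame(half_mags, rng):
    """Inverse DFT of a random-phase spectrum, returned as real samples.

    half_mags holds magnitudes for bins 0..N/2. Negative-frequency bins
    are filled by Hermitian symmetry so the time-domain frame is real.
    """
    half = len(half_mags) - 1
    n = 2 * half
    bins = [0j] * n
    bins[0] = complex(half_mags[0], 0.0)        # DC bin stays real
    bins[half] = complex(half_mags[half], 0.0)  # Nyquist bin stays real
    for k in range(1, half):
        bins[k] = cmath.rect(half_mags[k], rng.uniform(-math.pi, math.pi))
        bins[n - k] = bins[k].conjugate()
    return [
        sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def hann(m):
    """Symmetric Hann window of length m (zero at both ends)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (m - 1)) for i in range(m)]

def window_frame(frame, m):
    """Keep the first m samples of the frame and taper them to zero."""
    w = hann(m)
    return [frame[i] * w[i] for i in range(m)]

rng = random.Random(0)
frame = random_phase_frame([1.0] * 9, rng)  # N = 16 samples, flat envelope
windowed = window_frame(frame, 16)          # M = 16 here
print(frame[0], windowed[0])
```

Without the window, the first and last samples of each frame are generally nonzero, so consecutive frames do not join smoothly in the overlap-add. After windowing, both ends of the frame are forced to zero.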
So what might be happening is that by default SMSConsole assumes that
the 1STF frames are *NOT* line segment approximations and therefore
does *NOT* perform that last windowing at synthesis time. I have gone
through the SMS/Clam code a little, but I cannot find where I can
change this behavior, or even whether that is the default behavior.
Where should I look in the SMS/Clam code?
On 27 May 08, at 23:25, Xavier Amatriain wrote:
> Hi Roumbaba,
> In the paper you cite it says "you can", which does not mean "you
> have to" :-) Doing an approximation of the residual model is indeed
> an interesting thing to do, especially if you want to reduce the
> amount of data in your transformed signal, however it is not a must.
> Note that there are many other ways to model the residual apart
> from the one mentioned in that paper.
> So far, in CLAM we are using the residual as is, with no modeling
> or approximation. The "only" downside is that the transformed
> signal (SMS Data) is in fact larger than the original audio when it
> could be much smaller with not much loss in quality. If for
> whatever reason you do need to do the residual modeling you can
> look at the SpectralEnvelopeExtract processing. This processing
> generates a spectral approximation (spectrum in bpf format), but
> from an array of peaks. It would not be hard to modify it to work
> with an input spectrum.
> roumbaba wrote:
>> Hi all,
>> I am trying to understand how the residual spectrum gets modeled
>> in clam/SMS. I have read the Serra/Smith 1990 CMJ paper and as I
>> understand it, it describes two steps:
>> 1- subtract the harmonic spectrum from the original spectrum
>> 2- perform a line-segment approximation of the residual spectrum
>> obtained in 1
>> I have stepped through clam and SMS code and I think I can see
>> where step 1 gets performed:
>> mSpecSubstracter.Do(); /* step 1 gets performed here I think*/
>> but I cannot find where step 2 (line approximation) gets
>> performed. Where should I look in the code?
>> Thank you very much,
>> Here is a quote from the paper I mentioned above:
>> "Approximation of the Spectral Residual
>> Assuming that the residual signal is quasi-stochastic, each
>> magnitude-spectrum residual can be approximated by its envelope
>> since only its shape contributes to the sound characteristics.
>> [...] The particular line-segment approximation performed here is
>> done by stepping through the magnitude spectrum and finding local
>> maxima in every section, ..."
>> Clam-devel mailing list
>> Clam-devel at llistes.projectes.lafarga.org