[Clam-devel] residual spectrum line segment approximation?
roumbaba at gmail.com
Thu Jul 17 12:48:59 PDT 2008
Thanks Xavier! That helps a lot.
So I'll try to apply a BH92 window to my spectrum. I will do it in Matlab
for now to see if it works. I will let you know.
- To do it properly I need to understand why the analysis
windows are forced to be odd-sized. (The SMS tutorial says, under
<(Res)AnalysisWindowSize>: "Note: if the value entered is not odd, the
program will internally add +1 to it".) Probably to take
advantage of symmetry somewhere?
- Anyhow, the real question is this: the size of the 1STF matrix which
holds the complex spectrums is 513. How should I go about applying the
BH92 window to it?
I am thinking of this:
- Take the 513 bins complex spectrum
- build a corresponding 1024 (1025?) symmetric spectrum with the axis
of symmetry at bin 512 (513?)
- take the ifft of this spectrum
- multiply the result by the 1024-point bh92 window (value-by-value
multiply: value 1 of the bh92 window with sample 1 of the ifft obtained
in the previous step, etc.)
- take the fft of the resulting windowed signal
- Store this result in the 1STF frames for noise
Does that seem correct to you?
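
A minimal sketch of the steps listed above, in plain Python so that it runs without any toolbox. It assumes the 513 bins are bins 0 to N/2 of an even-size FFT, which gives N = 1024 (not 1025), with bin N-k being the complex conjugate of bin k. The helper names (bh92, fft, ifft, window_half_spectrum) are made up for this illustration and are not CLAM functions. The window coefficients are the standard 4-term Blackman-Harris (92 dB) values in symmetric form. Whether CLAM expects this exact window form, an odd window length, or a zero-phase (circularly shifted) buffer is not checked here.

```python
import cmath
import math


def bh92(n_points):
    """4-term Blackman-Harris (92 dB) window, symmetric form."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    d = n_points - 1
    return [a0
            - a1 * math.cos(2 * math.pi * n / d)
            + a2 * math.cos(4 * math.pi * n / d)
            - a3 * math.cos(6 * math.pi * n / d)
            for n in range(n_points)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT computed as conj(fft(conj(X))) / N."""
    n = len(spectrum)
    return [v.conjugate() / n
            for v in fft([c.conjugate() for c in spectrum])]


def window_half_spectrum(half):
    """Apply a BH92 window to a half spectrum (bins 0..N/2) via the time domain."""
    n = 2 * (len(half) - 1)
    # Hermitian mirror: bins N/2+1 .. N-1 are the conjugates of bins N/2-1 .. 1.
    full = list(half) + [half[k].conjugate()
                         for k in range(len(half) - 2, 0, -1)]
    # The spectrum is Hermitian, so the imaginary part is only rounding noise
    # (or a non-real DC/Nyquist bin) and is dropped.
    frame = [v.real for v in ifft(full)]
    windowed = [s * w for s, w in zip(frame, bh92(n))]
    # Keep bins 0..N/2 again so the result fits the original frame layout.
    return fft(windowed)[:len(half)]


spectrum = [complex(1.0, 0.0)] * 513
print(len(window_half_spectrum(spectrum)))
```

The multiplication in the time domain is equivalent to convolving the spectrum with the transform of the window, so the result has the same number of bins as the input.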
On 15 Jul 08, at 23:55, Xavier Amatriain wrote:
> Hi Roumbaba, and congrats for your progress!
> You are right on the source of your problem: SMSSynthesis expects
> your residual to come with an analysis window and if not things are
> likely to mess up.
> The lines that are "guilty" for that are around SMSSynthesis.cxx:252
> First the peaks are synthesized into a sinusoidal spectrum. Then
> the two spectrums are added. Already at that point the spectrums
> are supposed to have the same analysis window (BH92) and size. The
> effect of that window is undone in line 261 when the global
> spectral synthesis is performed.
> The issue here is that you need to guarantee that both spectrums
> come from a similar place before adding them... The sinusoidal
> peaks are reconstructed by convolving by the transform of the main
> lobe of the window (BH92) but you are reconstructing the residual
> in a different way. So.... you either apply the BH92 transform to
> your spectrum or avoid doing that in the peak synthesis (and then
> avoid multiplying by the inverse in the global spectral synthesis).
> Neither of the two options is immediate but I'd say the first one
> should be easier to work out.
> Hope it helps... and if you get it to work don't forget to report
> roumbaba wrote:
>> Hello all and thanks again for your previous help,
>> So I have written some matlab script to perform noise spectrum
>> line segment approximation.
>> - As input the script takes an sdif file generated by analysis
>> with SMSConsole.
>> - It then reads all sdif frames, in particular the 1STF frames
>> containing the noise spectrums in complex form.
>> - It converts these complex spectrums into magPhase form
>> - It performs line segment approximation on the amplitudes.
>> To check the impact of the approximation on the quality of
>> resynthesis the script does the following:
>> - It reconstructs full noise magnitude spectrums from the line
>> approximations (by linear interpolation)
>> - It randomizes the phases
>> - It converts the new "smoothed" magPhase spectrums back to
>> complex spectrums
>> - It writes back the sdif file with these new "smoothed"
>> spectrums instead of the original raw noise spectrums.
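
The reconstruction steps listed above (linear interpolation of the breakpoints, then random phases, then back to complex form) could look roughly like the following. This is only a sketch in plain Python, not the Matlab script itself. The function names are invented for illustration, and it assumes the frame is a half spectrum with the DC bin first and the Nyquist bin last, both of which are kept real.

```python
import cmath
import math
import random


def interpolate_envelope(breakpoints, num_bins):
    """Linearly interpolate (bin, magnitude) breakpoints to num_bins magnitudes.

    Bins before the first or after the last breakpoint hold that
    breakpoint's magnitude.
    """
    bps = sorted(breakpoints)
    out = []
    j = 0  # index of the segment [bps[j], bps[j + 1]] currently in use
    for k in range(num_bins):
        if k <= bps[0][0]:
            out.append(bps[0][1])
        elif k >= bps[-1][0]:
            out.append(bps[-1][1])
        else:
            while bps[j + 1][0] < k:
                j += 1
            (k0, m0), (k1, m1) = bps[j], bps[j + 1]
            out.append(m0 + (m1 - m0) * (k - k0) / (k1 - k0))
    return out


def random_phase_spectrum(mags, rng):
    """Attach uniformly distributed random phases to a magnitude spectrum."""
    spec = [m * cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for m in mags]
    # Assumption: first bin is DC and last bin is Nyquist; for a real
    # signal both must be real, so no random phase is applied to them.
    spec[0] = complex(mags[0], 0.0)
    spec[-1] = complex(mags[-1], 0.0)
    return spec


rng = random.Random(0)  # fixed seed only to make the example repeatable
mags = interpolate_envelope([(1, 0.9), (4, 0.5), (7, 0.7)], 9)
frame = random_phase_spectrum(mags, rng)
print(len(frame))
```

A frame built this way carries no analysis window, which is the situation the rest of this message is concerned with.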
>> Then I run SMSConsole to synthesize that sdif file with the exact
>> same parameters as for the original sdif file.
>> My problem is that the resulting synthesised noise sounds like
>> something is wrong in the synthesis overlap-add (like lots of
>> discontinuities in the resynthesis).
>> I think that this might be due to what is described in the Serra/
>> Smith 1990 CMJ paper concerning line segment approximation noise
>> " ...Since the [new] phase spectrum used is not the result of an
>> analysis process (with windowing of a waveform, zero padding, and
>> FFT computation), the resulting signal does not taper to 0 at the
>> boundaries. This is because a phase spectrum with random values
>> corresponds to a phase spectrum of a rectangular-windowed noise
>> waveform of size N. In order to succeed in the overlap-add
>> resynthesis (ie, to obtain smooth transitions between frames) we
>> need a smoothly windowed waveform of size M, where M is the
>> synthesis-window length. ...."
>> So what might be happening is that by default SMSConsole assumes
>> that the 1STF frames are *NOT* line segment approximation and
>> therefore does *NOT* perform that last windowing at synthesis
>> time. I have gone a little bit through SMS/Clam code but I cannot
>> find where I can change this behavior or even if that is the
>> default behavior. Where should I look in the SMS/Clam code?
>> On 27 May 08, at 23:25, Xavier Amatriain wrote:
>>> Hi Roumbaba,
>>> In the paper you cite it says "you can", which does not mean "you
>>> have to" :-) Doing an approximation of the residual model is indeed
>>> an interesting thing to do, especially if you want to reduce the
>>> amount of data in your transformed signal, however it is not a must.
>>> Note that there are many other ways to model the residual apart
>>> from the one mentioned in that paper.
>>> So far, in CLAM we are using the residual as is, with no modeling
>>> or approximation. The "only" downside is that the transformed
>>> signal (SMS Data) is in fact larger than the original audio when
>>> it could be much smaller with not much loss in quality. If for
>>> whatever reason you do need to do the residual modeling you can
>>> look at the SpectralEnvelopeExtract processing. This processing
>>> generates a spectral approximation (spectrum in bpf format) but
>>> from an array of peaks, it would not be hard to modify it to work
>>> with an input spectrum.
>>> roumbaba wrote:
>>>> Hi all,
>>>> I am trying to understand how the residual spectrum gets modeled
>>>> in clam/SMS. I have read the Serra/Smith 1990 CMJ paper and, as I
>>>> understand it, it describes two steps:
>>>> 1- subtract the harmonic spectrum from the original spectrum
>>>> 2- perform a line-segment approximation of the residual spectrum
>>>> obtained in 1
>>>> I have stepped through clam and SMS code and I think I can see
>>>> where step 1 gets performed:
>>>> mSpecSubstracter.Do(); /* step 1 gets performed here I think*/
>>>> but I cannot find where step 2 (line approximation) gets
>>>> performed. Where should I look in the code?
>>>> Thank you very much,
>>>> Here is a quote from the paper I mentioned above:
>>>> "Approximation of the Spectral Residual
>>>> Assuming that the residual signal is quasi-stochastic, each
>>>> magnitude-spectrum residual can be approximated by its envelope
>>>> since only its shape contributes to the sound characteristics.
>>>> [...] The particular line-segment approximation performed here
>>>> is done by stepping through the magnitude spectrum and finding
>>>> local maxima in every section, ..."
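
To make the quoted idea concrete, here is a small sketch of one way to pick envelope breakpoints: split the magnitude spectrum into sections and keep the largest value in each. The quote is cut off, so the equal-width sections and the function name line_segment_envelope are assumptions made for this illustration, not necessarily the paper's exact procedure and not CLAM code.

```python
def line_segment_envelope(mag, section_size):
    """Return (bin_index, magnitude) breakpoints: the maximum of each section."""
    if section_size < 1:
        raise ValueError("section_size must be >= 1")
    breakpoints = []
    for start in range(0, len(mag), section_size):
        section = mag[start:start + section_size]
        # Position of the largest magnitude inside this section.
        offset = max(range(len(section)), key=section.__getitem__)
        breakpoints.append((start + offset, section[offset]))
    return breakpoints


mag = [0.1, 0.9, 0.3, 0.2, 0.5, 0.4, 0.05, 0.7, 0.6]
print(line_segment_envelope(mag, 3))  # [(1, 0.9), (4, 0.5), (7, 0.7)]
```

Joining consecutive breakpoints with straight lines gives the line-segment approximation of the envelope, and a larger section_size gives fewer breakpoints and a smoother envelope.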
>>>> Clam-devel mailing list
>>>> Clam-devel at llistes.projectes.lafarga.org