[Clam-devel] [PATCH] (working in progress) Harmonizer

Thu Jul 5 17:24:57 PDT 2007

On 7/5/07, Pau Arumi <parumi at iua.upf.edu> wrote:
> En/na Hernán Ordiales ha escrit:
> > i'll post another mail here about that, looking for suggestions to
> > improve it
> > take into account that by default residuals are ignored (maybe i'll
> > toggle the default value, what do you think?)
>
> since there are so many controls i'd like to check with your
> results, could you post some audio file? (take your time)

most of the time (but not always) i get artifacts too (sometimes sounding
like R2-D2's voice :-) )

i will send audio examples soon...

i need advice on how to improve it; here is my Do() function:

bool SMSHarmonizer::Do( const SpectralPeakArray& inPeaks,
			const Fundamental& inFund,
			const Spectrum& inSpectrum,
			SpectralPeakArray& outPeaks,
			Fundamental& outFund,
			Spectrum& outSpectrum
		      )
{
	//Voice 0 (input voice)
	outPeaks = inPeaks;
	outFund = inFund;
	outSpectrum = inSpectrum;

	//TODO - skip if gain<0.01, check if outputs arrive clean or not
	TData gain0 = mVoice0Gain.GetLastValue();
	mSinusoidalGain.GetInControl("Gain").DoControl(gain0);
	mSinusoidalGain.Do(outPeaks,outPeaks);

	SpectralPeakArray mtmpPeaks;
	Fundamental mtmpFund;
	Spectrum mtmpSpectrum;

	for (int i=0; i < mVoicesPitch.Size(); i++)
	{
		TData gain = mVoicesGain[i].GetLastValue();
		if (gain<0.01) //means voice OFF
			continue;

		TData amount = mVoicesPitch[i].GetLastValue();
		mPitchShift.GetInControl("PitchSteps").DoControl(amount);
		mPitchShift.Do( inPeaks,
				inFund,
				inSpectrum,
				mtmpPeaks,
				mtmpFund,
				mtmpSpectrum);

		mSinusoidalGain.GetInControl("Gain").DoControl(gain);
		mSinusoidalGain.Do(mtmpPeaks,mtmpPeaks);

		outPeaks = outPeaks + mtmpPeaks;
		if (mIgnoreResidualCtl.GetLastValue()<0.01) // control is 0: residual is NOT ignored, so add it
			mSpectrumAdder.Do(outSpectrum, mtmpSpectrum, outSpectrum);
	}
	return true;
}

maybe overload (make a new one) SMSPitchShift processing which forgets
about residual?
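
For readers without the CLAM sources at hand, the sinusoidal part of the Do() above can be sketched on a toy peak list. Everything here is an assumption made for illustration: Peak, Voice, ShiftAndScale and Harmonize are hypothetical names, "PitchSteps" is taken to mean equal-tempered semitones, and adding two peak arrays is modelled as plain concatenation, which may not match what CLAM does.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one spectral peak; NOT the CLAM SpectralPeakArray.
struct Peak {
    double freq; // Hz
    double mag;  // linear magnitude
};

// Hypothetical per-voice settings: pitch offset in semitones and linear gain.
struct Voice {
    double semitones;
    double gain;
};

// Shift every peak by `semitones` (assumed equal temperament) and scale it by `gain`.
std::vector<Peak> ShiftAndScale(const std::vector<Peak>& in, double semitones, double gain)
{
    const double ratio = std::pow(2.0, semitones / 12.0);
    std::vector<Peak> out;
    out.reserve(in.size());
    for (const Peak& p : in)
        out.push_back({p.freq * ratio, p.mag * gain});
    return out;
}

// Mix voice 0 (the unshifted input) with every active voice.
// Voices whose gain is below `offThreshold` are skipped before any shifting is done.
std::vector<Peak> Harmonize(const std::vector<Peak>& in, double gain0,
                            const std::vector<Voice>& voices,
                            double offThreshold = 0.01)
{
    std::vector<Peak> out = ShiftAndScale(in, 0.0, gain0);
    for (const Voice& v : voices) {
        if (v.gain < offThreshold)
            continue; // voice OFF
        const std::vector<Peak> shifted = ShiftAndScale(in, v.semitones, v.gain);
        // Assumption: "adding" peak arrays is modelled as concatenation.
        out.insert(out.end(), shifted.begin(), shifted.end());
    }
    return out;
}
```

In this toy model the output holds one copy of the input peaks per active voice, so the peak count grows linearly with the number of voices.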

question:
what's the difference between something like:
mSpectralPeakArrayAdder.Do(outPeaks,mtmpPeaks,tmpoutPeaks);
(i use 'tmpoutPeaks' because SpectralPeakAdder cannot process in place)
and
outPeaks = outPeaks + mtmpPeaks; ?
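
Leaving aside what the CLAM classes do internally (not verified here), the two call shapes differ in plain C++ terms in who owns the temporary. The sketch below uses a hypothetical Peaks type and hypothetical functions AddOutOfPlace and Concat; Concat plays the role of operator+, and concatenation stands in for whatever the real adder computes.

```cpp
#include <vector>

// Hypothetical peak list; NOT the CLAM SpectralPeakArray.
using Peaks = std::vector<double>;

// Out-of-place form: `out` is cleared before it is filled,
// so the caller must pass a destination that is neither `a` nor `b`.
void AddOutOfPlace(const Peaks& a, const Peaks& b, Peaks& out)
{
    out.clear();
    out.insert(out.end(), a.begin(), a.end());
    out.insert(out.end(), b.begin(), b.end());
}

// Value-returning form (the role operator+ plays): the result is built in a
// fresh object, so `x = Concat(x, y)` is safe although x is input and destination.
Peaks Concat(const Peaks& a, const Peaks& b)
{
    Peaks result;
    AddOutOfPlace(a, b, result);
    return result;
}
```

With the first form the caller declares the temporary; with the second the temporary is created inside the call and then assigned. Whether the two produce the same peaks in CLAM depends on how each one is implemented there.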

> > what to do?
> >
> > - One fixed harmonizer processing with a fixed amount of controls and
> > a prototype
> > - One harmonizer processing with a dynamic amount of voices, but without
> > a prototype???
> > - Both (2 different processings)
>
> a single processing with #voices configuration. the prototyping
> binding is done (else big surprise) after configuration.

ok, i will add configuration to the processing, also for the residual
option (and with bounded amount of voices of course)
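
A minimal sketch of what such a configuration could look like, assuming hypothetical names throughout (HarmonizerConfig, HarmonizerSketch, kMaxVoices and the limit of 8 are made up for illustration and are not the CLAM configuration API):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumed upper bound on the number of voices.
const int kMaxVoices = 8;

// Hypothetical configuration object; field names are assumptions.
struct HarmonizerConfig {
    int numberOfVoices = 4;
    bool ignoreResidual = true;
};

class HarmonizerSketch {
public:
    // (Re)creates the per-voice controls. Anything that binds to those controls
    // (for example a prototype UI) has to run after this call.
    bool Configure(const HarmonizerConfig& cfg)
    {
        mConfig = cfg;
        // Clamp the requested voice count into [0, kMaxVoices].
        mConfig.numberOfVoices = std::min(std::max(cfg.numberOfVoices, 0), kMaxVoices);
        const std::size_t n = static_cast<std::size_t>(mConfig.numberOfVoices);
        mVoiceGains.assign(n, 0.0);
        mVoicePitches.assign(n, 0.0);
        return true;
    }

    int NumberOfVoices() const { return static_cast<int>(mVoiceGains.size()); }
    bool IgnoreResidual() const { return mConfig.ignoreResidual; }

private:
    HarmonizerConfig mConfig;
    std::vector<double> mVoiceGains;   // one gain control value per voice
    std::vector<double> mVoicePitches; // one pitch control value per voice
};
```

Clamping silently is only one option; rejecting an out-of-range configuration by returning false from Configure would be an equally reasonable design.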

> >> what do you mean exactly. what and when in-controls should be
> >> remembered?
> >
> > For example if i want to remember 'gains' or frequencies and have a
> > saved network of that
> > In this case, have a network configured with specific voices and gains
> > ready to play...
> > In spectralnetwork, nice values of robotization, gains, filter, freq,
> > etc...
>
> yes. it's a reasonable feature.
> but i'm not sure how network-saved in-controls should interact
> with in-control init at configuration time.
> my first thought is that network-saved should prevail (applied
> after config)

agree
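
The ordering agreed on here (configuration-time values first, network-saved values applied afterwards so they win) can be sketched as follows. ControlValues and ResolveInitialValues are hypothetical names, and dropping saved values for controls that no longer exist is an extra assumption, not something decided in this thread.

```cpp
#include <map>
#include <string>

// Hypothetical name -> value table for in-controls.
using ControlValues = std::map<std::string, double>;

// Start from the values set at configuration time, then let the values
// stored in the network file overwrite them.
ControlValues ResolveInitialValues(const ControlValues& configDefaults,
                                   const ControlValues& networkSaved)
{
    ControlValues result = configDefaults;
    for (const auto& entry : networkSaved) {
        // Assumption: a saved value for a control that the current
        // configuration does not create is ignored.
        if (result.count(entry.first))
            result[entry.first] = entry.second;
    }
    return result;
}
```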

> hernan, file it as a feature request, please.
> anyone wanting to work on this?

ok, done

id: 279
https://projectes.lafarga.cat/tracker/index.php?func=detail&aid=279&group_id=24&atid=174

> > ATM it is always the mean value between min and max (for example in
> > robotization i think a '0' value is better by default)
>
> yes we need this new interface/feature: incontrol.SetDefault(0)
> which implies that in cases where a default is given,
> GetDefault() won't return the mean but the specified value.
> it seems an easy hack, i'm open to receive a patch :-)

ok, i take it :-)
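
A rough sketch of the behaviour described above, on a hypothetical BoundedControl class rather than the real CLAM in-control (its actual members are not shown in this thread). Clamping the default into the control's range is an added assumption.

```cpp
// Hypothetical bounded in-control; a sketch, NOT the CLAM InControl class.
class BoundedControl {
public:
    BoundedControl(double min, double max)
        : mMin(min), mMax(max), mHasDefault(false), mDefault(0.0) {}

    // Overrides the implicit default. Assumption: the value is clamped to [min, max].
    void SetDefault(double value)
    {
        mDefault = value < mMin ? mMin : (value > mMax ? mMax : value);
        mHasDefault = true;
    }

    // Returns the explicit default if one was given, otherwise the mean of min and max.
    double GetDefault() const
    {
        return mHasDefault ? mDefault : 0.5 * (mMin + mMax);
    }

private:
    double mMin;
    double mMax;
    bool mHasDefault;
    double mDefault;
};
```

Keeping a separate mHasDefault flag means controls that never call SetDefault keep today's mean-value behaviour unchanged.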

On 7/5/07, Xavier Amatriain <xavier at create.ucsb.edu> wrote:
> Residuals are ignored because the small addition in quality does not
> justify the huge performance penalty.

yes, i read your comment in source code:

		/**
		 *	xamat: adding residual does not improve results much and adds a lot
		 *	of overhead, there should probably be a configuration parameter to
		 *	control whether we want to add residual or not, but that would mean
		 *	changing the kind of configuration. For the time being the output
		 *	residual is the input.
		 */

>> On this same line adding a dynamic number of voices might be dangerous. As I
>> explained in my previous email adding many peaks is translated into many
>> performance issues.
> However, a configurable but _bounded_ number of voices is not a
> problem.

(already said, but to clear doubts: i will add configuration to the
processing with bounded amount of voices)

-- 
Hernán
http://h.ordia.com.ar
GnuPG: 0xEE8A3FE9