[Clam-devel] My adventures with sample by sample processing (part II)

Xavier Amatriain xavier at create.ucsb.edu
Wed Jan 3 09:45:35 PST 2007


This is a longish email. I was tempted to write it up as a wiki page, but
we can always do that later by copying and pasting the email.

After a few days of profiling and debugging, here are my conclusions as a
follow-up to a previous email. The testbed uses CLAM 0.95 as a base and,
for external reasons, Visual Studio and Rational Quantify for profiling.
But most of the conclusions apply to any OS.

Port handling overhead
----------------------

In sample-by-sample processing, where the grain of the Do operations is
fine, every bit of overhead matters. The overhead currently imposed by
region management in Network mode is clearly not acceptable. In some
sample-by-sample processings it can account for 80% of the focus or more.

I addressed this issue by adding overloads (just as a test) of most of
the functions involved, for instance InPort::Consume() and
Region::WriterHasAdvanced(). In those overloads I simply took advantage
of the fact that the reading/writing size is 1. This alone reduced the
overhead considerably (by around 50% or more). But this is still not
enough, so this area needs a complete rework to be practical for
sample-by-sample processing.
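
To illustrate the kind of shortcut I mean, here is a minimal sketch. It
is not the actual CLAM code: ToyRegion, ConsumeOne() and ProduceOne() are
hypothetical names made up for this email, and the real Region class does
much more bookkeeping. The sketch only shows how a size=1 overload can
replace the general arithmetic with an increment and a compare.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a port region; NOT the real CLAM Region class.
class ToyRegion {
public:
    explicit ToyRegion(std::size_t capacity) : data_(capacity, 0.0f) {
        assert(capacity > 0);
    }

    // General path: advance the read position by an arbitrary size.
    void Consume(std::size_t n) {
        assert(n <= available_);
        readPos_ = (readPos_ + n) % data_.size();  // modulo on every call
        available_ -= n;
    }

    // Fast path for size == 1: one increment and one compare, no modulo.
    void ConsumeOne() {
        assert(available_ >= 1);
        if (++readPos_ == data_.size()) readPos_ = 0;
        --available_;
    }

    // Write a single token, again without any general size handling.
    void ProduceOne(float value) {
        assert(available_ < data_.size());
        data_[writePos_] = value;
        if (++writePos_ == data_.size()) writePos_ = 0;
        ++available_;
    }

    // Oldest token that has not been consumed yet.
    float Front() const {
        assert(available_ >= 1);
        return data_[readPos_];
    }

    std::size_t Available() const { return available_; }

private:
    std::vector<float> data_;
    std::size_t readPos_ = 0;
    std::size_t writePos_ = 0;
    std::size_t available_ = 0;
};
```

Consume(1) and ConsumeOne() leave the region in the same state; the
second one just does less work per call, which is what counts when the
call happens once per sample.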

In my case, I basically removed all ports from inside
ProcessingComposites and went back to using parametrized Do's. Still,
the overhead in the outermost network ports gives me problems.


Scheduling
----------

Another issue I encountered was the huge overhead imposed by the
scheduling policies that we have. Push simply did not work, even in the
simplest case, and Basic worked but only up to a point. The worst part is
how much time is spent in the CanConsumeAndProduce operation.

It was clear that static scheduling is a must in these cases, so I have
added a static scheduler that I hope to commit to the svn repository
shortly. It allows the user to define branches of processings and firing
schemes.
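
As a rough idea of what static scheduling buys us, here is a small
sketch. It is not the scheduler I intend to commit: StaticSchedule,
AddBranch(), AddToBranch() and Step() are hypothetical names, and
processings are reduced to plain callables. The point is only that the
firing order is decided once, up front, so that running a step involves
no per-firing query such as CanConsumeAndProduce.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch of a static schedule; not the actual CLAM class.
class StaticSchedule {
private:
    // A branch is an ordered list of firings repeated a fixed number of times.
    struct Branch {
        std::size_t repetitions;
        std::vector<std::function<void()>> firings;
    };
    std::vector<Branch> branches_;

public:
    using Firing = std::function<void()>;

    // Start a new branch that runs `repetitions` times per step.
    // Returns the index used to refer to the branch later.
    std::size_t AddBranch(std::size_t repetitions = 1) {
        branches_.push_back(Branch{repetitions, {}});
        return branches_.size() - 1;
    }

    // Append one firing (for instance, a call to a processing's Do) to a branch.
    void AddToBranch(std::size_t branch, Firing firing) {
        assert(branch < branches_.size());
        branches_[branch].firings.push_back(std::move(firing));
    }

    // Run one full step in the precomputed order, with no runtime checks.
    void Step() const {
        for (const Branch& b : branches_)
            for (std::size_t i = 0; i < b.repetitions; ++i)
                for (const Firing& f : b.firings)
                    f();
    }
};
```

The obvious trade-off is that the user has to get the firing scheme
right, since nothing checks token availability at run time.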


Token Delay
-----------

The current implementation, based on a queue, is really inefficient. The
problem is that the push_back operation implies a reallocation.

I had to do an emergency implementation based on a circular buffer. This
solution is a little ad hoc and, knowing that in the past we have run
away from in-house circular buffers, I will wait for further comments
before proceeding.
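
For discussion, this is the general shape of a circular-buffer delay. It
is a simplified sketch and not my emergency implementation: RingDelay and
Process() are hypothetical names, and the delay length is assumed to be
fixed at construction. The storage is allocated once, so processing a
token never allocates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-length delay line on top of a circular buffer.
template <typename Token>
class RingDelay {
public:
    // `delay` is the number of Process() calls a token is held back;
    // the buffer starts out filled with `initial`.
    explicit RingDelay(std::size_t delay, const Token& initial = Token())
        : buffer_(delay, initial) {
        assert(delay > 0);
    }

    // Store the incoming token and return the one stored `delay` calls ago.
    Token Process(const Token& in) {
        Token out = buffer_[pos_];
        buffer_[pos_] = in;
        if (++pos_ == buffer_.size()) pos_ = 0;  // wrap around, no reallocation
        return out;
    }

private:
    std::vector<Token> buffer_;
    std::size_t pos_ = 0;
};
```

With a delay of 2, the first two calls return the initial value and from
then on each call returns the token passed in two calls earlier.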

GUI-related issues
------------------

I am not so worried about these, as they are specific to the Network
Editor. But it should be noted that the MainWindow::event callback takes
about 50% of a regular execution time. However, this runs on another
thread, so I am not sure the profiling information is that reliable
here.


All in all, it is clear that the model is valid and workable, but we need
a little refactoring to make it feasible. It is good, though, that we
have at least a first approach and a testbed for this development, and it
should not take much time.




-- 

/*********************************
 *       Xavier Amatriain        *
 *  Associate Director - MATi    *
 *  Research Director - CREATE   *
 *    UCSB, Santa Barbara CA     *
 *      1-(805)- 893 83 52       *
 ********************************/





