Re: [eigen] Allocation policy of eigen decompositions

[ Thread Index | Date Index | More Archives ]

2010/4/21 Adolfo Rodríguez Tsouroukdissian <adolfo.rodriguez@xxxxxxxxxxxxxxxxx>:
> On Wed, Apr 21, 2010 at 6:28 PM, Jitse Niesen <jitse@xxxxxxxxxxxxxxxxx>
> wrote:
>> On Tue, 20 Apr 2010, Benoit Jacob wrote:
>>> 2010/4/20 Adolfo Rodríguez Tsouroukdissian
>>> <adolfo.rodriguez@xxxxxxxxxxxxxxxx>:
>>>> I wanted to bring up some issues concerning the Eigenvalues module.
>>> I am adding Jitse in CC to ping him, although he is reading this list
>>> anyway; he has been improving the Eigenvalues module a lot recently so
>>> might be interested.
>> I'm definitely interested. However, I didn't have time to reply earlier. I
>> don't see a problem with the changes proposed by Benoit. I'm glad to hear
>> that the template Options parameter in JacobiSVD is on the way out.
>> ReturnByValue is your friend and easy to use.
>>> That said, in the case of
>>> HessenbergDecomposition<MatrixType>::matrixQ(), we can and should do
>>> much better than that: return a HouseholderSequence object, describing
>>> the set of householder vectors stored in the matrix. That is not only
>>> much more efficient algorithmically, that also avoids the malloc. For
>>> example, see HouseholderQR::householderQ().
>> I guess I don't understand how HouseholderSequence works. In fact, I
>> thought about using it when rewriting HessenbergDecomposition, but I
>> refrained from doing so because I had trouble understanding the code.
>> In this code,
>>   MatrixXd A = MatrixXd::Random(100,100);
>>   HouseholderQR<MatrixXd> houseOfA(A);
>>   MatrixXd Q = houseOfA.householderQ();
>> I'd have thought that the last line constructs a temporary
>> HouseholderSequence objects which holds a m_vectors member (matrix of size
>> 100 x 100), calls its evalTo() method, and throws the temporary object away.
>> Isn't there a malloc needed to allocate the memory in which the m_vectors
>> member is stored?
>> And here is my list of issues that may require API changes.
>> Chaining decompositions
>> =======================
>> As Adolfo says, ComplexSchur uses HessenbergDecomposition. If you call
>> ComplexSchur::compute(matrix), then the ComplexSchur object creates a
>> HessenbergDecomposition object, calls its compute() method, and copies the
>> matrices computed into its own member variables. One problem is that
>> creating the HessenbergDecomposition object causes a memory allocation. That
>> is the issue that Adolfo is concerned with.
> Well, what I did in the patch was to make the  HessenbergDecomposition
> instance a member of ComplexSchur, so it is no longer allocated in
> compute(). This pollutes a bit the protected class interface, but lets you
> call compute(matrix) multiple times with same-sized matrices without
> incurring in multiple HessenbergDecomposition instance allocations. What
> remains to be solved is the return by value issue.
>> Another problem is that this uses more memory than necessary. The
>> HessenbergDecomposition object needs memory and the ComplexSchur object
>> needs memory. Ideally, they should use the same memory, instead of copying
>> the matrix. What is the best way to do that? Perhaps ComplexSchur can use
>> the memory under control of HessenbergDecomposition by casting away the
>> const-ness of the const reference returned by
>> HessenbergDecomposition::packedMatrix(), but this is not very kosher.
>> The same issue arises in other contexts in the Eigenvalues module:
>> RealSchur uses HessenbergDecomposition, EigenSolver uses RealSchur,
>> ComplexEigenSolver uses ComplexSchur. In all these cases there is potential
>> for re-using memory, reducing the memory footprint. Whether it's worthwhile
>> to save O(n^2) memory in an O(n^3) algorithm is another matter ... perhaps
>> not the most pressing issue.
>> Merging EigenSolver and ComplexEigenSolver
>> ==========================================
>> My other big question is whether to merge the EigenSolver and
>> ComplexEigenSolver classes. Intuitively, I should say yes, and that's also
>> what the ToDo list on the wiki says. However, the interface of EigenSolver
>> and ComplexEigenSolver are rather different. EigenSolver stores the
>> eigenvalues and the pseudo-eigenvectors, so EigenSolver::eigenvectors()
>> returns MatrixType (this can be replaced via the ReturnByValue mechanism but
>> that does not matter for this discussion). ComplexEigenSolver stores the
>> eigenvalues and the eigenvectors, so ComplexEigenSolver::eigenvectors()
>> returns a const ref to MatrixType. How do we merge them?
>> One possibility is to have eigenvectors() in the merged EigenSolver class
>> return a MatrixType if the scalar type is real and a const ref to MatrixType
>> if the scalar type is complex. This is similar to eval() returning sometimes
>> a matrix and sometimes a const ref. However, it may be confusing to users.
>> The second possibility is to have the merged EigenSolver class store the
>> eigenvectors and not the pseudo-eigenvectors in the real case. However, this
>> costs twice as much memory because the eigenvectors are complex and the
>> pseudo-eigenvectors are real. As the real case is the important case, this
>> is not ideal.
>> Then there is the minor issue that (real) EigenSolver has a
>> pseudoEigenvector() method and ComplexEigenSolver has not. However, we can
>> include a pseudoEigenvector() method in the merged EigenSolver class, and
>> have it return the eigenvectors in the complex case. I don't think this is a
>> problem.
>> There is a third possibility: not to merge (real) EigenSolver and
>> ComplexEigenSolver.
>> For bonus points, discuss merging RealSchur and ComplexSchur.
>> Cheers,
>> Jitse
>> PS: Adolfo, the Eigenvalues part of your patch looks good, at first sight.
>> Incidentally, how do you test that there are no malloc's?
> Mmmh,  I used a hand-made approach. Consider an example:
> ...
> const int size = 5;
> MatrixXd A(MatrixXd::Random(size, size));
> Eigen::LLT<MatrixXd> LLT(size);
> // LLT.compute(A); // <- Notice commented-out line
> ...
> I would compile this first and run it with valgrind, which prints out
> something like:
> total heap usage: 10 allocs, 10 frees, 267,992 bytes allocated
> Then I would uncomment the last line and run with valgrind again. If no more
> allocs are made, we're good to go. Now, there must be a more civilized way
> of doing this, but this was my quick&dirty solution.

This is a quite good approach. If you want something 'civilized', you
can leverage the fact that all of Eigen's heap allocations go through
a few functions in Memory.h: ei_aligned_malloc,
ei_conditional_aligned_malloc<false>, ... i think that's it. Grep for
EIGEN_NO_MALLOC, actually.

So you could implement a check or counter using some global boolean
variable there to turn it on and off... of course we don't want that
in Eigen but you could easily hack that locally.


> HTH,
> Adolfo

Mail converted by MHonArc 2.6.19+