Re: [eigen] Allocation policy of eigen decompositions

On Wed, Apr 21, 2010 at 7:03 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:

2010/4/21 Adolfo Rodríguez Tsouroukdissian <adolfo.rodriguez@xxxxxxxxxxxxxxxx>:

>
>
> On Wed, Apr 21, 2010 at 6:28 PM, Jitse Niesen <jitse@xxxxxxxxxxxxxxxxx>
> wrote:
>>
>> On Tue, 20 Apr 2010, Benoit Jacob wrote:
>>
>>> 2010/4/20 Adolfo Rodríguez Tsouroukdissian
>>> <adolfo.rodriguez@xxxxxxxxxxxxxxxx>:
>>>>
>>>> I wanted to bring up some issues concerning the Eigenvalues module.
>>>
>>> I am adding Jitse in CC to ping him, although he is reading this list
>>> anyway; he has been improving the Eigenvalues module a lot recently so
>>> might be interested.
>>
>> I'm definitely interested. However, I didn't have time to reply earlier. I
>> don't see a problem with the changes proposed by Benoit. I'm glad to hear
>> that the template Options parameter in JacobiSVD is on the way out..
>> ReturnByValue is your friend and easy to use.
>>
>>> That said, in the case of
>>> HessenbergDecomposition<MatrixType>::matrixQ(), we can and should do
>>> much better than that: return a HouseholderSequence object, describing
>>> the set of householder vectors stored in the matrix. That is not only
>>> much more efficient algorithmically, that also avoids the malloc. For
>>> example, see HouseholderQR::householderQ().
>>
>> I guess I don't understand how HouseholderSequence works. In fact, I
>> thought about using it when rewriting HessenbergDecomposition, but I
>> refrained from doing so because I had trouble understanding the code.
>>
>> In this code,
>>
>> MatrixXd A = MatrixXd::Random(100,100);
>> HouseholderQR<MatrixXd> houseOfA(A);
>> MatrixXd Q = houseOfA.householderQ();
>>
>> I'd have thought that the last line constructs a temporary
>> HouseholderSequence objects which holds a m_vectors member (matrix of size
>> 100 x 100), calls its evalTo() method, and throws the temporary object away.
>> Isn't there a malloc needed to allocate the memory in which the m_vectors
>> member is stored?
>>
>>
>> And here is my list of issues that may require API changes.
>>
>>
>> Chaining decompositions
>> =======================
>>
>> As Adolfo says, ComplexSchur uses HessenbergDecomposition. If you call
>> ComplexSchur::compute(matrix), then the ComplexSchur object creates a
>> HessenbergDecomposition object, calls its compute() method, and copies the
>> matrices computed into its own member variables. One problem is that
>> creating the HessenbergDecomposition object causes a memory allocation. That
>> is the issue that Adolfo is concerned with.
>
> Well, what I did in the patch was to make the HessenbergDecomposition
> instance a member of ComplexSchur, so it is no longer allocated in
> compute(). This pollutes a bit the protected class interface, but lets you
> call compute(matrix) multiple times with same-sized matrices without
> incurring in multiple HessenbergDecomposition instance allocations. What
> remains to be solved is the return by value issue.
>>
>> Another problem is that this uses more memory than necessary. The
>> HessenbergDecomposition object needs memory and the ComplexSchur object
>> needs memory. Ideally, they should use the same memory, instead of copying
>> the matrix. What is the best way to do that? Perhaps ComplexSchur can use
>> the memory under control of HessenbergDecomposition by casting away the
>> const-ness of the const reference returned by
>> HessenbergDecomposition::packedMatrix(), but this is not very kosher.
>>
>> The same issue arises in other contexts in the Eigenvalues module:
>> RealSchur uses HessenbergDecomposition, EigenSolver uses RealSchur,
>> ComplexEigenSolver uses ComplexSchur. In all these cases there is potential
>> for re-using memory, reducing the memory footprint. Whether it's worthwhile
>> to save O(n^2) memory in an O(n^3) algorithm is another matter ... perhaps
>> not the most pressing issue.
>>
>>
>> Merging EigenSolver and ComplexEigenSolver
>> ==========================================
>>
>> My other big question is whether to merge the EigenSolver and
>> ComplexEigenSolver classes. Intuitively, I should say yes, and that's also
>> what the ToDo list on the wiki says. However, the interface of EigenSolver
>> and ComplexEigenSolver are rather different. EigenSolver stores the
>> eigenvalues and the pseudo-eigenvectors, so EigenSolver::eigenvectors()
>> returns MatrixType (this can be replaced via the ReturnByValue mechanism but
>> that does not matter for this discussion). ComplexEigenSolver stores the
>> eigenvalues and the eigenvectors, so ComplexEigenSolver::eigenvectors()
>> returns a const ref to MatrixType. How do we merge them?
>>
>> One possibility is to have eigenvectors() in the merged EigenSolver class
>> return a MatrixType if the scalar type is real and a const ref to MatrixType
>> if the scalar type is complex. This is similar to eval() returning sometimes
>> a matrix and sometimes a const ref. However, it may be confusing to users.
>>
>> The second possibility is to have the merged EigenSolver class store the
>> eigenvectors and not the pseudo-eigenvectors in the real case. However, this
>> costs twice as much memory because the eigenvectors are complex and the
>> pseudo-eigenvectors are real. As the real case is the important case, this
>> is not ideal.
>>
>> Then there is the minor issue that (real) EigenSolver has a
>> pseudoEigenvector() method and ComplexEigenSolver has not. However, we can
>> include a pseudoEigenvector() method in the merged EigenSolver class, and
>> have it return the eigenvectors in the complex case. I don't think this is a
>> problem.
>>
>> There is a third possibility: not to merge (real) EigenSolver and
>> ComplexEigenSolver.
>>
>> For bonus points, discuss merging RealSchur and ComplexSchur.
>>
>>
>> Cheers,
>> Jitse
>>
>>
>> PS: Adolfo, the Eigenvalues part of your patch looks good, at first sight.
>> Incidentally, how do you test that there are no malloc's?
>
> Mmmh, I used a hand-made approach. Consider an example:
>
> ...
> const int size = 5;
> MatrixXd A(MatrixXd::Random(size, size));
>
> Eigen::LLT<MatrixXd> LLT(size);
> // LLT.compute(A); // <- Notice commented-out line
> ...
>
>
> I would compile this first and run it with valgrind, which prints out
> something like:
>
> total heap usage: 10 allocs, 10 frees, 267,992 bytes allocated
>
> Then I would uncomment the last line and run with valgrind again. If no more
> allocs are made, we're good to go. Now, there must be a more civilized way
> of doing this, but this was my quick&dirty solution.

This is a quite good approach. If you want something 'civilized', you
can leverage the fact that all of Eigen's heap allocations go through
a few functions in Memory.h: ei_aligned_malloc,
ei_conditional_aligned_malloc<false>, ... i think that's it. Grep for
EIGEN_NO_MALLOC, actually.

I'll give that a try when the time comes to finish up the Eigenvalues module :)
BTW, I opened up a bitbucket account. My user name is adolfo.rodriguez

Adolfo

So you could implement a check or counter using some global boolean
variable there to turn it on and off... of course we don't want that
in Eigen but you could easily hack that locally.

Benoit

>
> HTH,
>
> Adolfo
>
>
>