Re: [eigen] Sparse usage feedback



On Mon, 18 Oct 2010 15:10:34 +0200, Gael Guennebaud
<gael.guennebaud@xxxxxxxxx> wrote:

>> I wrote iteratorS because, as you only have one operator*(), you'd have
>> to give more than one iterator, or you'd make your user pay for unused
>> data.
>> (i.e. you could give rowsCols(j).begin()/.end() with operator*()
>> returning a std::pair<IndexType, IndexType>,
>> and values(j).begin()/.end() returning a Scalar.)

>
> ok, but in many cases you need both the values and the coordinates in
> row/col or inner/outer. For instance, in Eigen itself all usages of
> the InnerIterator make use of the trio value/outer/inner. So we would
> need a ton of different iterators to satisfy all needs. Also,
> std::pair<> is a very bad thing as it makes code unreadable: what is
> first()? what is second()?





Yes, I understand your point of view. There is no easy & obvious
"best option" (otherwise we'd agree on it at once :) ), but a tradeoff
between:

1°) have only one iterator, with all the data available through custom
member functions

pro: smaller API footprint (only one class); a perfect fit for Eigen itself
con: must be adapted to algorithm implementations expecting "usual"
iterators

2°) have various iterators: e.g. one over Coords2D { IndexType inner; IndexType outer; },
one over Scalar, and one over Data { Coords2D coords; Scalar value; }

pro: easier to learn & use when used to the usual C++ iterators &
containers; easier to interface with C++ implementations expecting such
iterators
con: larger API footprint (more classes instead of three methods), but
maybe not so much larger a cognitive footprint
(the classes could also be reused elsewhere, I believe).





>
> Actually, iterators on values and "inner indices" are very easy to do
> since they are just Scalar* and Index*. For instance something like:
>
> for(int j=0; j<m.outerSize(); ++j)
>   std::for_each(m.innerVector(j)._valuePtr(),
>                 m.innerVector(j)._valuePtr() + m.innerVector(j).nonZeros(), …);
>
> should already work for sparse objects with storage.
>
> Also, in case I move from the current iterator mechanism to the
> begin/end paradigm (while keeping the rest intact, i.e., with the full
> set of attributes like value(), row(), col()), then it would be
> straightforward to write custom adaptors to STL-compatible
> iterators. To give an idea:
>
> template<typename EigenIterator>
> class value_adaptator {
>   EigenIterator m_it;
> public:
>   value_adaptator(const EigenIterator& it) : m_it(it) {}
>   value_adaptator& operator++() { ++m_it; return *this; }
>   Scalar& operator*() { return m_it.value(); }
>   ....
> };

> 



Indeed. I'm only advocating making such iterators available (maybe in
unsupported) so that users do not keep reimplementing them.

 

>
> Actually, Eigen already allows you to perform a lot of operations, and
> I'd be interested to know what you are missing that makes you fall back
> to the STL algorithms. Actually, I plan to add a method returning the
> non zeros as a dense Eigen vector. We could also do the same with the
> indices. I guess that alone would already fill many of your needs?



I understand that Eigen is unlike the usual container libraries: it does
not use algorithms & iterators to abstract away containers, because
expression templates *use* the containers directly to optimize for
performance.

However, you cannot implement every visitor & reduction, so when using
Eigen::Matrix as a container instead of a high-performance math structure
(i.e. when doing I/O, or interacting with other components), it can be
useful to trade the high performance for the standard algorithms on any
callable type.
(What if I want the sequence of coords + 1 to feed some Fortran-speaking
external module?)





>> However, I think that Sparse classes should have constructors taking
>> dense matrices, calling sparseView(), because if you can construct an
>> object of class A from an object of class B, the idiomatic way to do
>> it in C++ is to have an A(B const&) constructor.
>
> Yes, then we could also have an explicit constructor pruning elements
> which are exactly 0.
>
> I think this was the plan, so patch welcome :)
>
> Note that the sparseView approach is still needed because it offers
> many more features:
[snip]
> S * SparseMatrix(D); // create a temporary sparse matrix object as a
> copy of the non zeros of D, and returns a sparse object



Don't get me wrong: I like sparseView() and realize its importance.
It is just that I'd have expected to be able to create a Sparse directly
from a Dense.
I'll try a patch :)



> Yes, in this case this is safe regarding accuracy, but I think that
> when we hit such a mix of integers/doubles, there are 3 possibilities:
>
> 1 - the user made a mistake (typo), and this is not what he wanted to
> do (might be hard to debug if we allow this);
>
> 2 - this is indeed what the user wanted to do, but if we silently do
> the (very) *expensive* cast for him he won't notice it, while maybe it
> would be better (faster) to directly use double, or to make a copy as
> double because the cast happens many times on the same data. So the
> idea is that by forcing the user to write the cast we somehow force
> him to think a bit about his algorithm and the appropriate use of
> scalar types;
>
> 3 - a justified use case. Even then, I think it is a good thing not to
> hide the cast, to help readers better understand the code and what
> really happens.



You are very right!
I forgot the expensiveness of the int→double conversion.
I was arguing against gratuitous "syntactic salt"
(what is the opposite of "syntactic sugar"? :) ) for safe operations,
but you are right that we do not want to make slow operations too easy
(as in, invisible in the code).
If I can locate the relevant doc, I'll propose a patch
(as in "provided mixed-type operations exclude unsafe operations (i.e.
loss of precision) or slow operations (i.e. integral to floating point
conversion)").



> currently on my local hard drive, I'll commit/push it soon for
> discussion.
>
> Yes, you are right, for sparse objects dynamic dispatching is fine.
> Typically, the shape (selfadjoint-ness, triangular-ness, etc.) of the
> matrix could also be runtime flags... This puzzled me for a while. The
> problem is that I would also like the API to be as close as possible
> to the dense API, such that it is possible to write code which works
> for both worlds... So I kept the triangular/selfadjoint view
> mechanisms.



Indeed, regularity is the first quality of an API.



>
> For the decompositions, the plan is to have one "low level" class per
> algorithm, with a common API (compute()/solve()) and specific
> fine-tuning features. For instance the new CholmodDecomposition has
> options to control some basic aspects, such as whether you want an
> LLt/LDLt, simplicial/supernodal, etc., plus full access to the
> cholmod_common object.
>
> Then we could imagine a unified solver class built on top of these
> low-level classes with runtime dispatching, as you did, with a runtime
> manager to register/remove solvers on the fly, etc. I see it as a
> separate module.



I'm eagerly awaiting it, and I will keep you up to date with my
Eigen::Sparse usage.



Thanks for the great libs (and the insightful answers)!



Best regards,



Bernard


