Re: [eigen] Question about ColMajor vs. RowMajor

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]


Thanks for your comments.

I would like to stress that this issue is not specific to
operator()(index) and operator[]; it also exists with the data()
method returning a pointer to the underlying array of coefficients.

First of all, I think that it would be even more surprising if

   matrix(index)

wasn't the same thing as

   matrix.data()[index]

This means that the current behavior of operator()(index) is the only
possible one, and that changing it to e.g. always mean column-major
would be even more surprising, as it would make the above two
expressions return different values in the row-major case.

So I think that this is a documentation issue. There is lots that we
should do to make our API docs more useful, like, as I mentioned, let
class docs include the docs of all parent classes.

Cheers,
Benoit


2011/12/18 Radu B. Rusu <rusu@xxxxxxxxxxxxxxxx>:
> Benoit,
>
> Thanks a lot for the explanations!
>
>
> On 12/17/2011 07:21 AM, Benoit Jacob wrote:
>> 2011/12/16 Michael Dixon <mdixon@xxxxxxxxxxxxxxxx>:
>>>  Hi,
>>>
>>> My colleagues and I were recently playing around with changing the
>>> default for matrices from ColMajor to RowMajor, and we found that
>>> doing so changed the behavior of the () operator.  Specifically, when
>>> accessing a two-dimensional matrix with a single index (e.g., mat(i)
>>> ), the ColMajor/RowMajor flag affects which element of the matrix gets
>>> indexed as the ith.  Our assumption had been that changing the
>>> ColMajor/RowMajor flag would be completely transparent, which is to
>>> say, it would affect the underlying storage (and thus the
>>> performance/efficiency), but it wouldn't change the overall behavior
>>> of the code.
>>
>> This is mostly true, but it doesn't apply to the few parts of the API
>> that are explicitly about low-level addressing of the underlying array
>> of coefficients as a single-dimensional array. operator(index) is like
>> that. It is meant for use cases where you just want to iterate over
>> coefficients in the most memory-local way, and don't care if that
>> means col-major or row-major iteration.
>
> Though I completely see the reasoning behind this, I am a bit worried that most people won't be able to differentiate
> between "low-level" API and the rest. It's almost like one would want the operator that is used the most to have the
> "slower but safer" behavior (if it's a choice and if its behavior is expected to change depending on compile flags), and
> then have another way to get data faster (i.e., most memory-local) that is explicit. For example, if people misuse the
> () operator and are not reading the documentation properly, then every time someone upstream adds the
> -DEIGEN_DEFAULT_TO_ROW_MAJOR flag, their code will break in horrible ways without a single warning.
>
> Also, we've been spoiled by how easy it is to access data using single indexing in our code in the past, so it's very
> likely that there's bits of code out there which misuse the () operator without thinking about how the data is actually
> arranged in memory. I'm thinking routines that accept a matrix as input but do not check if it's column major or row
> major, etc.
>
>>>  I was just wondering whether the altered behavior of the
>>> () operator was intended (in which case, our assumption of
>>> transparency was incorrect), or whether this was an unintended bug.
>>
>> This behavior is intended and not considered 'altered', it just plays
>> at a lower level than where the notions of "col-major" vs "row-major"
>> storage exist.
>>
>> Maybe we'd need to document that more clearly, but it's already
>> explained on page 1 of the tutorial:
>> http://eigen.tuxfamily.org/dox/TutorialMatrixClass.html
>
> I have to admit that I missed that myself. :) Definitely adding this all over the place where () is defined, etc would
> improve things.
>
>
>> "Note that the syntax m(index) is not restricted to vectors, it is
>> also available for general matrices, meaning index-based access in the
>> array of coefficients. This however depends on the matrix's storage
>> order."
>>
>> We need to have class API docs inherit base classes so that they are
>> more usable, and then put a similar explanation there.
>
>
> I am still a bit conflicted on what to tell users. :( Clearly with respect to the documentation, it's a clear "API
> misuse" issue. However, people have been brought up in a certain way, where these type of data accessors make them
> ignorant with respect to the memory layout and other micro optimizations that happen under the hood. I suppose that if
> this is standard with other libraries (did anyone look at it, e.g., boost multi-arrays, etc?) then the justification is
> "that's how everyone is doing it". :) Otherwise it's almost like we would want to shelter them from this lower-level API
> or help them by preventing mistakes like this in the code, and here I can see a few ways of doing that:
>
>  * prevent them from using [] on anything that's not a Vector/Array
>  * scream at them in the documentation that this part of the API is for advanced users - they have to understand and
> know how the matrix data that they receive is arranged in memory, etc (speaking of which, is there a simple way to
> determine that at run-time? I'm thinking: I write a routine that expects data to be column major, and I would like to
> check if the MatrixXf being passed to me is indeed column major or row major)
>  * make []'s behavior safe (and slower?), and add a different accessor for most local-memory. $0.02
>
> Thanks Benoit!
>
> Cheers,
> Radu.
> --
> http://www.pointclouds.org
>



Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/