Re: [eigen] Question about ColMajor vs. RowMajor

[ Thread Index | Date Index | More lists.tuxfamily.org/eigen Archives ]

To: "Radu B. Rusu" <rusu@xxxxxxxxxxxxxxxx>
Subject: Re: [eigen] Question about ColMajor vs. RowMajor
From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
Date: Sun, 18 Dec 2011 18:12:07 -0500
Cc: eigen@xxxxxxxxxxxxxxxxxxx, Michael Dixon <mdixon@xxxxxxxxxxxxxxxx>, Suat Gedikli <gedikli@xxxxxxxxxxxxxxxx>
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=GTVlwnWligZMHXonovQeC1hwJgPNj8KU0l5pN8fEQCo=; b=uY09fnNTsesgScteDESFn0zQNxPiioRPLhqBBiZ3gGyrwPbO61BmWismaAMlLQgzi0 50A6ycdSZNlQeEKRuXF+I3odwX8eVPv+7dtpx86a6IAwt5f6D8UFlvD0VBlYjeHsDNXL Ds+er7QI0RvhgE7/D/IopAXO24KDBj7MT0J6s=

One could also consider deprecating operator()(index) for non-vectors,
as maybe it doesn't make it explicit enough that it's dependent on the
storage order (whereas "data()" should ring a bell)...  and instead
expose it under a different name making it clearer, but I am reluctant
to do that as that would cause a lot of trouble to existing users; I
would have to be convinced that the amount of trouble it helps avoid,
it greater.

Benoit

2011/12/18 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> Thanks for your comments.
>
> I would like to stress that this issue is not specific to
> operator()(index) and operator[]; it also exists with the data()
> method returning a pointer to the underlying array of coefficients.
>
> First of all, I think that it would be even more surprising if
>
>   matrix(index)
>
> wasn't the same thing as
>
>   matrix.data()[index]
>
> This means that the current behavior of operator()(index) is the only
> possible one, and that changing it to e.g. always mean column-major
> would be even more surprising, as it would make the above two
> expressions return different values in the row-major case.
>
> So I think that this is a documentation issue. There is lots that we
> should do to make our API docs more useful, like, as I mentioned, let
> class docs include the docs of all parent classes.
>
> Cheers,
> Benoit
>
>
> 2011/12/18 Radu B. Rusu <rusu@xxxxxxxxxxxxxxxx>:
>> Benoit,
>>
>> Thanks a lot for the explanations!
>>
>>
>> On 12/17/2011 07:21 AM, Benoit Jacob wrote:
>>> 2011/12/16 Michael Dixon <mdixon@xxxxxxxxxxxxxxxx>:
>>>>  Hi,
>>>>
>>>> My colleagues and I were recently playing around with changing the
>>>> default for matrices from ColMajor to RowMajor, and we found that
>>>> doing so changed the behavior of the () operator.  Specifically, when
>>>> accessing a two-dimensional matrix with a single index (e.g., mat(i)
>>>> ), the ColMajor/RowMajor flag affects which element of the matrix gets
>>>> indexed as the ith.  Our assumption had been that changing the
>>>> ColMajor/RowMajor flag would be completely transparent, which is to
>>>> say, it would affect the underlying storage (and thus the
>>>> performance/efficiency), but it wouldn't change the overall behavior
>>>> of the code.
>>>
>>> This is mostly true, but it doesn't apply to the few parts of the API
>>> that are explicitly about low-level addressing of the underlying array
>>> of coefficients as a single-dimensional array. operator(index) is like
>>> that. It is meant for use cases where you just want to iterate over
>>> coefficients in the most memory-local way, and don't care if that
>>> means col-major or row-major iteration.
>>
>> Though I completely see the reasoning behind this, I am a bit worried that most people won't be able to differentiate
>> between "low-level" API and the rest. It's almost like one would want the operator that is used the most to have the
>> "slower but safer" behavior (if it's a choice and if its behavior is expected to change depending on compile flags), and
>> then have another way to get data faster (i.e., most memory-local) that is explicit. For example, if people misuse the
>> () operator and are not reading the documentation properly, then every time someone upstream adds the
>> -DEIGEN_DEFAULT_TO_ROW_MAJOR flag, their code will break in horrible ways without a single warning.
>>
>> Also, we've been spoiled by how easy it is to access data using single indexing in our code in the past, so it's very
>> likely that there's bits of code out there which misuse the () operator without thinking about how the data is actually
>> arranged in memory. I'm thinking routines that accept a matrix as input but do not check if it's column major or row
>> major, etc.
>>
>>>>  I was just wondering whether the altered behavior of the
>>>> () operator was intended (in which case, our assumption of
>>>> transparency was incorrect), or whether this was an unintended bug.
>>>
>>> This behavior is intended and not considered 'altered', it just plays
>>> at a lower level than where the notions of "col-major" vs "row-major"
>>> storage exist.
>>>
>>> Maybe we'd need to document that more clearly, but it's already
>>> explained on page 1 of the tutorial:
>>> http://eigen.tuxfamily.org/dox/TutorialMatrixClass.html
>>
>> I have to admit that I missed that myself. :) Definitely adding this all over the place where () is defined, etc would
>> improve things.
>>
>>
>>> "Note that the syntax m(index) is not restricted to vectors, it is
>>> also available for general matrices, meaning index-based access in the
>>> array of coefficients. This however depends on the matrix's storage
>>> order."
>>>
>>> We need to have class API docs inherit base classes so that they are
>>> more usable, and then put a similar explanation there.
>>
>>
>> I am still a bit conflicted on what to tell users. :( Clearly with respect to the documentation, it's a clear "API
>> misuse" issue. However, people have been brought up in a certain way, where these type of data accessors make them
>> ignorant with respect to the memory layout and other micro optimizations that happen under the hood. I suppose that if
>> this is standard with other libraries (did anyone look at it, e.g., boost multi-arrays, etc?) then the justification is
>> "that's how everyone is doing it". :) Otherwise it's almost like we would want to shelter them from this lower-level API
>> or help them by preventing mistakes like this in the code, and here I can see a few ways of doing that:
>>
>>  * prevent them from using [] on anything that's not a Vector/Array
>>  * scream at them in the documentation that this part of the API is for advanced users - they have to understand and
>> know how the matrix data that they receive is arranged in memory, etc (speaking of which, is there a simple way to
>> determine that at run-time? I'm thinking: I write a routine that expects data to be column major, and I would like to
>> check if the MatrixXf being passed to me is indeed column major or row major)
>>  * make []'s behavior safe (and slower?), and add a different accessor for most local-memory. $0.02
>>
>> Thanks Benoit!
>>
>> Cheers,
>> Radu.
>> --
>> http://www.pointclouds.org
>>

References:
- [eigen] Question about ColMajor vs. RowMajor
  - From: Michael Dixon
- Re: [eigen] Question about ColMajor vs. RowMajor
  - From: Benoit Jacob
- Re: [eigen] Question about ColMajor vs. RowMajor
  - From: Radu B. Rusu
- Re: [eigen] Question about ColMajor vs. RowMajor
  - From: Benoit Jacob

Messages sorted by: [ date | thread ]
Prev by Date: Re: [eigen] Question about ColMajor vs. RowMajor
Next by Date: [eigen] Eigen Benchmark (number of FLOPS)
Previous by thread: Re: [eigen] Question about ColMajor vs. RowMajor
Next by thread: Re: [eigen] Question about ColMajor vs. RowMajor

Mail converted by MHonArc 2.6.19+

http://listengine.tuxfamily.org/