Re: [eigen] gcc bug hit by eigen, workaround proposal
I thought further about that, and it seems to me that we'll have to deal with
this problem in the long term:
- the borders of this bug are quite fuzzy, as some comments on my bug
report show. For instance, in plain C, gcc is sometimes able to unroll nested
loops. This makes me fear that whatever the bugfix turns out to be, it will
not completely eradicate the issue.
- even if gcc 4.3 did completely overcome the issue, I wouldn't like to have
to write in big red letters on Eigen's website "for use only with gcc 4.3 or
later". In the long term we should support at the very least gcc 4.1 or later,
and preferably gcc 3.3 or later.
It seems to me that we should go for loop unrolling by template
metaprogramming.
- that's what some other libraries do, like MTL and Blitz++.
- googling "template metaprogramming unroll loop" suggests that this is the
commonly accepted way to go.
- a gcc developer advised me to do so in a comment on my bug report.
I initially feared it would be impossible to achieve because we'd only want to
unroll loops in the fixed-size version, not in the dynamic-size one. But
actually, the solution to that problem is exactly the same one that allowed us
to have fixed-size vs. dynamic-size in the first place: the CRTP!
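
To make the idea concrete, here is a minimal sketch of how the unrolling and the CRTP dispatch could fit together. It is not Eigen's actual code: the names Unroller, MatrixBase, FixedMatrix, DynamicMatrix, forEachEntry and Recorder are all made up for illustration, and the matrices carry no storage. The recursion is written with a bool template parameter so that it also compiles with older compilers.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: calls f(row, col) for each entry of a Size x Size
// matrix, column by column, with the recursion expanded at compile time.
template <int Size, int Index, bool Done = (Index == Size * Size)>
struct Unroller {
    template <typename Func>
    static void run(Func& f) {
        f(Index % Size, Index / Size);       // row, col
        Unroller<Size, Index + 1>::run(f);   // next entry, resolved at compile time
    }
};

// End of the recursion: nothing left to do.
template <int Size, int Index>
struct Unroller<Size, Index, true> {
    template <typename Func>
    static void run(Func&) {}
};

// CRTP base: forwards to whatever traversal the derived class provides.
template <typename Derived>
struct MatrixBase {
    template <typename Func>
    void forEachEntry(Func& f) {
        static_cast<Derived*>(this)->forEachEntryImpl(f);
    }
};

// Fixed-size variant: traversal is unrolled by the templates above.
template <int Size>
struct FixedMatrix : MatrixBase<FixedMatrix<Size> > {
    template <typename Func>
    void forEachEntryImpl(Func& f) { Unroller<Size, 0>::run(f); }
};

// Dynamic-size variant: size is only known at run time, so keep plain loops.
struct DynamicMatrix : MatrixBase<DynamicMatrix> {
    explicit DynamicMatrix(int n) : n_(n) {}
    template <typename Func>
    void forEachEntryImpl(Func& f) {
        for (int col = 0; col < n_; ++col)
            for (int row = 0; row < n_; ++row)
                f(row, col);
    }
    int n_;
};

// Test functor that records the (row, col) pairs it is called with.
struct Recorder {
    std::vector<std::pair<int, int> > visited;
    void operator()(int row, int col) {
        visited.push_back(std::make_pair(row, col));
    }
};
```

Calling forEachEntry on a FixedMatrix<3> and on a DynamicMatrix(3) with a Recorder visits the same nine entries in the same order; only the fixed-size path goes through the compile-time recursion.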
This deserves to be done carefully, and at the same time I don't want to delay
Eigen 1.0. So I think we'll schedule this for Eigen 1.1 later this winter
(February?), and those who need high speed before that will just have to
use SVN.
So here's the current roadmap:
Eigen 0.9.5 around December 20. Need to finish the projective-geometry stuff,
and to add tests for the recently added stuff; other than that it's mostly
done.
Eigen 1.0 around December 30. I expect that the main difference from Eigen
0.9.5 will be documentation, website, and maybe portability fixes.
Eigen 1.1 in February 2007. Will have the template loop unrolling, making it
fast on any gcc. Other features not decided yet, might include quaternions.
Benoit
On Thursday 14 December 2006 19:20, Franz Keferböck wrote:
> sounds reasonable, but we'd have to keep it in the back of our heads
> that we wanna revert this once gcc's fixed (to make the code readable
> again)
>
> On 14/12/06, Benoit Jacob <jacob@xxxxxxxxxxxxxxx> wrote:
> > Hi List,
> >
> > While measuring Eigen performance I realized that gcc doesn't
> > completely unroll _nested_ loops. More precisely, in case of a
> > nested loop, it only unrolls the innermost loop, while the outer loop
> > gets only partially unrolled.
> >
> > Namely, with fixed-size matrices, Eigen does a lot of nested loops like
> >
> >   for( col = 0; col < size(); col++)
> >     for( row = 0; row < size(); row++)
> >       do_something( row, col );
> >
> > and for fixed-size classes, size() returns the template parameter Size,
> > which is known at compile-time, so one would expect gcc to be able to
> > unroll these nested loops. I always assumed it, and designed Eigen
> > around this assumption. But it's not the case. I filed a bug report here,
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30201
> >
> > and one gcc developer said this would be something for gcc 4.3. Obviously
> > we can't wait for gcc 4.3 to get good performance in Eigen. I measured a
> > 6x speed difference in some methods. So what I propose is to have a
> > system of preprocessor macros manually unrolling loops. Need to think
> > more about that though, but the rough idea is to replace
> >
> >   for( col = 0; col < size(); col++)
> >     for( row = 0; row < size(); row++)
> >     {
> >       ...
> >     }
> >
> > with
> >
> >   for( int foo = 0; foo < size() * size(); foo++)
> >   {
> >     col = foo / size();
> >     row = foo % size();
> >
> >     ...
> >   }
> >
> > The rationale is that gcc is able to unroll single loops correctly, it
> > only fails to unroll _nested_ loops.
> >
> > I would like to have this sorted out and implemented before 1.0, which
> > means in 2 weeks. I guess that some of us need good performance right
> > now.
> >
> > Benoit
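
The flattened-loop workaround quoted above relies on the single index foo mapping back to the same (row, col) pairs, in the same order, as the nested loops. A small sketch to check that, assuming a compile-time Size in place of size(); the function names nestedOrder and flattenedOrder are made up for this example:

```cpp
#include <cassert>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, int> > Visits;

// Reference traversal: the original nested loops, columns outermost.
template <int Size>
Visits nestedOrder() {
    Visits v;
    for (int col = 0; col < Size; ++col)
        for (int row = 0; row < Size; ++row)
            v.push_back(std::make_pair(row, col));
    return v;
}

// Proposed workaround: one loop over Size * Size, indices recovered
// by integer division and remainder.
template <int Size>
Visits flattenedOrder() {
    Visits v;
    for (int foo = 0; foo < Size * Size; ++foo) {
        const int col = foo / Size;
        const int row = foo % Size;
        v.push_back(std::make_pair(row, col));
    }
    return v;
}
```

Comparing nestedOrder<4>() with flattenedOrder<4>() confirms that the two traversals agree. Whether gcc then fully unrolls the single loop is a separate question that only measurements can answer.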