Re: [eigen] Performance difference icc <-> gcc, EIGEN_STRONG_INLINE

Defining EIGEN_STRONG_INLINE as __forceinline for ICC was, and still is, required in many places to get ICC to inline properly, and even so the result is not perfect. But I'm really surprised by your case, in which it seems to be beneficial to downgrade EIGEN_STRONG_INLINE from "__forceinline" to the less aggressive "inline". That's odd to me, as it suggests that your 13x slowdown is caused by overly aggressive inlining, i.e. code bloat. Another explanation would be some conflict with other compiler flags. In either case, as Christoph suggested, it would help a lot to take a quick look at the ASM of your critical part to see what the key difference in inlining is between the fast and slow versions.

Gaël.

On Thu, Mar 14, 2019 at 4:35 PM Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi!

On 14/03/2019 05.08, Michael Riesch wrote:
>> We could easily remove EIGEN_COMP_ICC from the cases where
>> EIGEN_STRONG_INLINE is a __forceinline:
>> https://bitbucket.org/eigen/eigen/src/default/Eigen/src/Core/util/Macros.h#Macros.h-755
>>
>>
>> But I don't know the history of this, i.e., if it was necessary for
>> older ICC versions. Or if it is necessary at other places.
> Same here, hence I wanted to ask whether defining EIGEN_STRONG_INLINE
> has some disadvantages. According to the Intel profiling tools, my
> function that contains all the calculations actually calls the Eigen
> functions (e.g., Eigen::Matrix<...>::Matrix<...> and
> Eigen::Matrix<...>::operator=). I expected that those will be inlined.
> Is this something to worry about?

I think what you finally need to do is to look at the assembly generated
from (the performance critical parts of) your program.
Defining `EIGEN_STRONG_INLINE` as `inline` should be safe regarding
functionality, but I can't give any promises on the performance impact
with ICC, especially with different compile flags.
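
One way to make the assembly easier to read is to keep the hot function out-of-line under its own symbol, then check whether the helpers it uses still appear as `call` instructions inside it. The following is only a minimal sketch of that idea, not code from Eigen or mbsolve: `DEMO_NOINLINE`, `demo_dot3` and `demo_hot_kernel` are made-up names, and `demo_dot3` merely stands in for a small library routine that one expects to be inlined.

```cpp
#include <cassert>
#include <cstddef>

// DEMO_NOINLINE is a hypothetical helper macro for this sketch.
#if defined(_MSC_VER)
  #define DEMO_NOINLINE __declspec(noinline)
#else
  #define DEMO_NOINLINE __attribute__((noinline))  // GCC-style attribute
#endif

// Stand-in for a small library routine that should be inlined.
inline double demo_dot3(const double* a, const double* b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Deliberately not inlined, so it appears as one symbol in the listing.
// If demo_dot3 was inlined, the loop body contains no call to it.
DEMO_NOINLINE double demo_hot_kernel(const double* a, const double* b,
                                     std::size_t n) {
  assert(a != nullptr && b != nullptr);
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += demo_dot3(a + 3 * i, b + 3 * i);  // n blocks of 3 values each
  }
  return sum;
}
```

Generating an assembly listing for the translation unit (for example with `-S` on GCC) for both the fast and the slow configuration and comparing the body of `demo_hot_kernel` should show where the inlining decisions differ.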

> Maybe I should add that I do not use the -inline-forceinline flag for
> the Intel compiler since this led to incredibly long compilation times.

Interesting. It could be that, when this flag is inactive,
`__forceinline` essentially evaluates to nothing.
Perhaps for ICC we should do by default:
   #define EIGEN_STRONG_INLINE __forceinline inline
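
For readers unfamiliar with the pattern, here is a rough, self-contained sketch of how such a macro is typically selected per compiler, with an opt-out that falls back to plain `inline`. It is an illustration under stated assumptions, not Eigen's actual definition: `DEMO_STRONG_INLINE`, `DEMO_USE_PLAIN_INLINE`, `demo_axpy` and `demo_update` are hypothetical names, and the combined `__forceinline inline` form proposed above is left out because its behaviour would have to be checked on ICC first.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro for this sketch (not Eigen's EIGEN_STRONG_INLINE).
// Defining DEMO_USE_PLAIN_INLINE before this point downgrades to `inline`.
#if !defined(DEMO_USE_PLAIN_INLINE) && \
    (defined(_MSC_VER) || defined(__INTEL_COMPILER))
  #define DEMO_STRONG_INLINE __forceinline
#else
  #define DEMO_STRONG_INLINE inline
#endif

// Small helper that we would like to see inlined into the loop below.
DEMO_STRONG_INLINE double demo_axpy(double a, double x, double y) {
  return a * x + y;
}

// Hot loop: y[i] = a * x[i] + y[i] for i in [0, n).
inline void demo_update(double a, const double* x, double* y, std::size_t n) {
  assert(x != nullptr && y != nullptr);
  for (std::size_t i = 0; i < n; ++i) {
    y[i] = demo_axpy(a, x[i], y[i]);
  }
}
```

Building the same code once with and once without `DEMO_USE_PLAIN_INLINE` gives two variants whose timings and assembly can be compared directly.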


> [...]
> I think "Science" is the best fit, since mbsolve [1] "is an open-source
> solver tool for the Maxwell-Bloch equations, which are used to model
> light-matter interaction in nonlinear optics."

Done!

Cheers,
Christoph


>
> Thanks a lot for any comments!
>
> Regards,
> Michael
>
> [1] https://github.com/mriesch-tum/mbsolve
>
>

--
  Dr.-Ing. Christoph Hertzberg





