[eigen] Re: 4x4 matrix inverse


- *To*: eigen <eigen@xxxxxxxxxxxxxxxxxxx>
- *Subject*: [eigen] Re: 4x4 matrix inverse
- *From*: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- *Date*: Mon, 14 Dec 2009 23:25:32 -0500

Hi,

To summarize recent commits: all this is now done in the development branch; it only remains to consider backporting.

The SSE code is 4.5x faster than my plain scalar path! I guess that's explained not only by SSE intrinsics but also by better ordering of instructions...

There is one thing where I didn't follow Intel's code: they use an RCPSS instruction to compute 1/det approximately, followed by a Newton-Raphson iteration. This sacrifices up to 2 bits of precision in the mantissa, which already is a bit nontrivial for us (4x4 matrix inversion is a basic operation on which people will rely very heavily). To help solve that dilemma (performance vs precision) I benchmarked it, and it turns out that on my Core i7, DIVSS is slightly faster!! Intel's paper was written for the Pentium III so that's perhaps not surprising, but I saw forum posts mentioning that the RCPSS trick is still faster on the Core2. If you want to test, see lines 128-130 in Inverse_SSE.h.

I have a question. I currently get warnings in this code (taken straight from Intel):

    __m128 tmp1;
    tmp1 = _mm_loadh_pi(_mm_loadl_pi(tmp1, (__m64*)(src)), (__m64*)(src+ 4));

The warning claims that tmp1 is used uninitialized here. GCC doesn't understand that tmp1 is only passed to _mm_loadl_pi, which writes into it and does not read from it. How to fix that warning? I tested initializing tmp1, and this had a not-totally-negligible impact on performance (because there are 2 more variables that need this). There does not seem to be an __attribute__ for this.

Benoit

2009/12/4 Benoit Jacob <jacob.benoit.1@xxxxxxxxx>:
> Hi,
>
> Long ago I thought it would be a good idea to optimize 4x4 matrix
> inverse using "Euler's trick", which greatly reduces the number of
> operations but relies on some 2x2 block inside the matrix being
> invertible.
>
> The problem is that this gives bad precision, and the best compromise
> that I could find between precision and performance is still:
> - 10x more imprecise in the worst case
> - only 25% faster.
>
> My last reason to cling to this approach is that it was supposedly
> more vectorizable, but reading this,
> ftp://download.intel.com/design/PentiumIII/sml/24504301.pdf
> I realized that Intel engineers actually figured out how to vectorize the
> plain old cofactors approach very efficiently.
>
> So I'll switch to cofactors in both branches, I think. I'll also
> implement SSE at least in the default branch.
>
> Question: do you think that Intel's code is provided free of use? Or
> should I avoid looking at it? Even if I can't look at it, they still
> provide good explanations.
>
> Benoit
>

**Follow-Ups**:

- **[eigen] Re: 4x4 matrix inverse** *From:* Benoit Jacob
- **Re: [eigen] Re: 4x4 matrix inverse** *From:* Hauke Heibel
- **Re: [eigen] Re: 4x4 matrix inverse** *From:* Gael Guennebaud

**References**:

- **[eigen] 4x4 matrix inverse** *From:* Benoit Jacob
