Re: [eigen] Issues regarding Quaternion-alignment and const Maps
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Issues regarding Quaternion-alignment and const Maps
- From: Benoit Jacob <jacob.benoit.1@xxxxxxxxx>
- Date: Fri, 9 Jul 2010 17:07:27 -0400
Wow, very good work.
I can indeed confirm the 2x speed improvement, and once I moved your
benchmarking code to a non-inlinable function called from main(), the
speed-up got even a bit higher (indeed, GCC fails to properly optimize
code placed directly in the main() function).
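
For reference, here is a minimal sketch of what "a non-inlinable function called from main()" can look like. It is not the benchmark from this thread: run_kernel, time_kernel_seconds and the BENCH_NOINLINE macro are hypothetical names, and the loop body is only a stand-in for the quaternion product being timed.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical macro: asks the compiler to keep the function out of line.
#if defined(__GNUC__) || defined(__clang__)
#define BENCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define BENCH_NOINLINE __declspec(noinline)
#else
#define BENCH_NOINLINE
#endif

// Stand-in kernel (an assumption, not the quaternion code): the timed loop
// lives here rather than directly in main().
BENCH_NOINLINE double run_kernel(std::size_t iterations) {
    double acc = 0.0;
    for (std::size_t i = 0; i < iterations; ++i) {
        acc += 0.5 * static_cast<double>(i % 7);
    }
    return acc;
}

// Times one call to run_kernel and hands the result back through 'result'
// so the computation cannot be discarded as unused.
inline double time_kernel_seconds(std::size_t iterations, double* result) {
    const auto start = std::chrono::steady_clock::now();
    *result = run_kernel(iterations);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

With this layout, main() only calls time_kernel_seconds() and prints the elapsed time together with the result.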
Could you make a patch against the development branch? (We're not
going to add features to 2.0 at this point).
http://eigen.tuxfamily.org/index.php?title=Developer%27s_Corner#Generating_a_patch
Also, I didn't know about that loaddup instruction in SSE3. It's
great! I'll have a look at using it in ei_pset1 when SSE3 is
available.
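
To make the idea concrete, here is a rough sketch of a broadcast for doubles that uses _mm_loaddup_pd when SSE3 is enabled and falls back to _mm_set1_pd (SSE2) or plain scalar code otherwise. The helper name broadcast_double is made up for illustration; this is not Eigen's ei_pset1, only the kind of dispatch I have in mind.

```cpp
#if defined(__SSE3__)
#include <pmmintrin.h>  // SSE3 intrinsics, including _mm_loaddup_pd
#elif defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics, including _mm_set1_pd
#endif

// Hypothetical helper: writes the value at *src into both slots of out.
inline void broadcast_double(const double* src, double out[2]) {
#if defined(__SSE3__)
    // Load one double from memory and duplicate it into both lanes.
    const __m128d v = _mm_loaddup_pd(src);
    _mm_storeu_pd(out, v);
#elif defined(__SSE2__)
    // SSE2 path: broadcast the scalar with the set1 intrinsic.
    const __m128d v = _mm_set1_pd(*src);
    _mm_storeu_pd(out, v);
#else
    // Portable fallback when no SSE support is enabled at compile time.
    out[0] = *src;
    out[1] = *src;
#endif
}
```

Compiling with -msse3 selects the loaddup path; with only -msse2 the set1 path is used, and the result is the same in both cases.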
Benoit
2010/7/9 Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxx>:
> Benoit Jacob wrote:
>>> and (quaternion * quaternion) only for float (at least in
>>> Eigen2).
>>
>> That's still the case. SSE wouldn't bring a very big benefit for
>> doubles here, I'm afraid.
>
> It does, if SSE3 is available.
> Using this program I just hacked together (after spending half a day
> finding some useful documentation about SSE-intrinsics ...):
> http://www.informatik.uni-bremen.de/~chtz/eigen_quat_sse.cc
>
> I got a speed-up of more than 2x (no thorough benchmarking though, and
> no unit testing either ...)
> I didn't check how much one could gain with just SSE2 (the SSE3
> _mm_addsub_pd comes in very handy :) )
>
> ===== TIMINGS =====
> ~/workspace/test> g++ -msse2 -I../../eigen -O2 eigen_quat_sse.cc
> ~/workspace/test> time ./a.out
> 0.1572 0.3143 0.4715 -0.8089
> real 0m3.182s
> user 0m3.156s
> sys 0m0.004s
> ~/workspace/test> g++ -msse3 -I../../eigen -O2 eigen_quat_sse.cc
> ~/workspace/test> time ./a.out
> 0.1572 0.3143 0.4715 -0.8089
> real 0m1.481s
> user 0m1.472s
> sys 0m0.000s
>
>
>
> --
> ----------------------------------------------
> Dipl.-Inf. Christoph Hertzberg
> Cartesium 0.051
> Universität Bremen
> Enrique-Schmidt-Straße 5
> 28359 Bremen
>
> Tel: (+49) 421-218-64252
> ----------------------------------------------
>
>
>