Re: [eigen] Architecture specific performance optimizations
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Architecture specific performance optimizations
- From: Jason Newton <nevion@xxxxxxxxx>
- Date: Sun, 9 Mar 2014 06:16:26 -0700
I use Eigen heavily in my day job: camera-related math and filtering/state estimation problems. Some of our guys do the point cloud stuff too. We put these things together into embedded systems (the kind backed by recent i7s), and the matrix multiplications from things like Kalman filters are one of the things we have to manage carefully because of how many there are and their dimensionality. A single line ends up taking something like 30% of the overall program's run time, and it consists of two matrix multiplications.
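To illustrate the kind of line meant here, below is a minimal sketch of a Kalman covariance prediction, P' = F P F^T + Q, which is a typical two-product expression. This is an assumption about what such a hot line might look like, not the actual code in question. The names `Mat`, `mul`, `transpose` and `predict_covariance` are hypothetical, and plain `std::array` is used so the sketch compiles without Eigen; with Eigen the whole thing would be the one-liner noted in the comment.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size N x N matrix, row-major (illustration only).
template <std::size_t N>
using Mat = std::array<std::array<double, N>, N>;

// Naive matrix product C = A * B.
template <std::size_t N>
Mat<N> mul(const Mat<N>& a, const Mat<N>& b) {
    Mat<N> c{};  // value-initialised, so all entries start at zero
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t k = 0; k < N; ++k)
            for (std::size_t j = 0; j < N; ++j)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Transpose of A.
template <std::size_t N>
Mat<N> transpose(const Mat<N>& a) {
    Mat<N> t{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            t[j][i] = a[i][j];
    return t;
}

// Covariance prediction P' = F * P * F^T + Q: two matrix products and one sum.
// With Eigen this would be written as: P = F * P * F.transpose() + Q;
template <std::size_t N>
Mat<N> predict_covariance(const Mat<N>& F, const Mat<N>& P, const Mat<N>& Q) {
    Mat<N> out = mul(mul(F, P), transpose(F));
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            out[i][j] += Q[i][j];
    return out;
}
```

Each product costs on the order of N^3 multiply-adds, so when an expression like this runs on every filter update, it is exactly the sort of thing where vectorized (e.g. AVX) product kernels would matter.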
So WRT us: for AVX/AVX2, I'd be more than happy to run tests and benchmarks if you need additional data points or don't have a CPU with those extensions.
It'd be nice if there were support for AMD's OpenCL static C++ extensions, much like the CUDA support Gael added a while ago. I'm not sure how much trouble it would be to bring it there; I suspect not too bad, because OpenCL and CUDA share most of the same limitations. I may tear into it someday in the next year...