Re: [eigen] Performance colwise matrix mult vs. raw loop
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] Performance colwise matrix mult vs. raw loop
- From: Norbert Wenzel <norbert.wenzel.lists@xxxxxxxxx>
- Date: Tue, 16 Jan 2018 21:40:00 +0100
On 2018-01-15 11:54, Gael Guennebaud wrote:
> the two pieces of code are not equivalent because one applies the
> transformation in-place, whereas the other needs to allocate a temporary. [...]
Thanks for the explanation.
> [...] the product cannot be carried out in-place because of
> aliasing issues, and a temporary has to be created. Of course, as you
> realized, if we evaluate the result one column at once, then instead of
> allocating a whole 3xN temporary, it is enough to allocate a 3x1 temporary
> vector, and since the size is known at compile time and that it is very
> small, it can be "allocated" on the stack and even optimized away by the
> compiler. Unfortunately there is no way for Eigen to figure this out,
> especially at compile-time. [...]
I understand you cannot determine any aliasing issues at compile time.
But I fail to see why there is no way to determine the size of the
temporary. Do you mean this cannot be done in Eigen as it is now or this
can generally not be determined at compile time?
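
For illustration, the two evaluation strategies discussed above can be
sketched in plain C++. This is not Eigen code and not how Eigen
implements products; the names Mat3, Points, transform_full_temporary
and transform_per_column are hypothetical, and the storage layouts are
assumptions chosen for the example. It only shows why a 3x1 stack
temporary per column is enough to avoid aliasing when the 3x3 factor is
fixed-size.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not Eigen types.
using Mat3 = std::array<double, 9>;  // row-major 3x3: A(r,c) = a[3*r + c]
using Points = std::vector<double>;  // column-major 3xN: P(r,j) = p[3*j + r]

// Strategy 1: evaluate A*P into a full 3xN temporary, then replace P.
// The temporary costs one heap allocation of 3*N doubles.
void transform_full_temporary(const Mat3& a, Points& p) {
    const std::size_t n = p.size() / 3;
    Points tmp(p.size());
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t r = 0; r < 3; ++r) {
            double sum = 0.0;
            for (std::size_t c = 0; c < 3; ++c) {
                sum += a[3 * r + c] * p[3 * j + c];
            }
            tmp[3 * j + r] = sum;
        }
    }
    p.swap(tmp);
}

// Strategy 2: evaluate one column at a time. Each result column depends
// only on the matching input column, so a 3x1 temporary on the stack is
// enough to avoid overwriting inputs that are still needed.
void transform_per_column(const Mat3& a, Points& p) {
    const std::size_t n = p.size() / 3;
    for (std::size_t j = 0; j < n; ++j) {
        double col[3];  // fixed size, known at compile time
        for (std::size_t r = 0; r < 3; ++r) {
            col[r] = a[3 * r + 0] * p[3 * j + 0]
                   + a[3 * r + 1] * p[3 * j + 1]
                   + a[3 * r + 2] * p[3 * j + 2];
        }
        for (std::size_t r = 0; r < 3; ++r) {
            p[3 * j + r] = col[r];  // write back only after the column is done
        }
    }
}
```

Both functions produce the same result; they differ only in how much
temporary storage they need and where it lives.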
> In your case, we would need some kind of colwise/rowwise noalias [...].
> To be honest I've
> never thought about that possibility, but [...] this might be worth the
> effort.
Thanks for considering this possibility.
As I've already said, this issue was surprising to me primarily because I
thought colwise/rowwise would essentially boil down to the loop code
I've written manually. Your explanation shed some light on this issue,
but it also clearly demonstrated that I need to understand Eigen
internals better when I care about runtime performance.