[eigen] heads-up: ARM prefetch fixes



Hi,

I hope this was OK: I just pushed the following changeset to the default and 3.3 branches:

default branch: https://bitbucket.org/eigen/eigen/commits/0974c5e72c12891855a2e01dd886c21e881fd310

3.3 branch: https://bitbucket.org/eigen/eigen/commits/6ae8b07ef7d4787836bfaf7c599ecd2134f49f30

This does two things:

 - Actually generate prefetch instructions on ARM64. On a Pixel XL Android device, running on one big core (Kryo @ 2.15 GHz), 1024x1024 matrix multiplication (rowmajor * colmajor -> colmajor, which is what we tend to use in NN applications) gets ~10% faster with this change.

 - On ARM32, the asm statement was needlessly clobbering "cc" (condition codes), and this change drops that clobber. Nothing in the ARM assembler reference suggests that this instruction touches condition codes: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Chdjffbi.html
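
For anyone who has not looked at this part of the code, here is a rough, self-contained sketch of the pattern described in the two points above. It is not the code from the changeset: the names sketch_prefetch, sum_with_prefetch and kAhead are made up for illustration, and the choice of the PLDL1KEEP hint on ARM64 and the prefetch distance are assumptions. It emits a prefetch through inline asm on ARM64 and ARM32 without listing "cc" as a clobber, and falls back to __builtin_prefetch (or nothing) elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only; not Eigen's actual code.
inline void sketch_prefetch(const void* addr) {
#if defined(__aarch64__) && defined(__GNUC__)
  // ARM64: PRFM with the PLDL1KEEP hint (assumed choice of hint).
  // The clobber list is empty: a prefetch hint does not modify flags.
  __asm__ __volatile__("prfm pldl1keep, [%[addr]]\n" : : [addr] "r"(addr));
#elif defined(__arm__) && defined(__GNUC__)
  // ARM32: PLD, again with no "cc" clobber.
  __asm__ __volatile__("pld [%[addr]]\n" : : [addr] "r"(addr));
#elif defined(__GNUC__)
  // Other targets with GCC or Clang: let the compiler pick the instruction.
  __builtin_prefetch(addr);
#else
  // Unknown compiler: prefetching is only a hint, so doing nothing is safe.
  (void)addr;
#endif
}

// Hypothetical usage: sum an array while prefetching a few elements ahead.
inline float sum_with_prefetch(const float* data, std::size_t n) {
  assert(data != nullptr || n == 0);
  const std::size_t kAhead = 16;  // assumed prefetch distance, not tuned
  float acc = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    if (i + kAhead < n) sketch_prefetch(data + i + kAhead);
    acc += data[i];
  }
  return acc;
}
```

Since a prefetch is only a hint, the result of sum_with_prefetch is the same with or without it; only the timing can change, and whether it helps depends on the core and the access pattern.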

Cheers,
Benoit

