Re: [eigen] heads-up: ARM prefetch fixes


Thank you for the details, sounds good!


On Wed, Mar 15, 2017 at 12:04 PM, Benoit Jacob <jacob.benoit.1@xxxxxxxxx> wrote:

Hope that was OK: I just pushed the following changeset to the default and 3.3 branches:

default branch:

3.3 branch:

This does two things:

 - actually generate prefetch instructions on ARM64. On a Pixel XL Android device, running on one big core (Kryo @ 2.15 GHz), this improves 1024x1024 matrix multiplication speed (rowmajor * colmajor -> colmajor, which is what we tend to use in NN applications) by ~10%.

 - on ARM32, the asm statement was needlessly clobbering "cc" (the condition codes). Nothing in the ARM assembler reference suggests that the prefetch instruction touches the condition codes.
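
For readers who have not seen this kind of code, here is a minimal sketch of what the two points above look like in practice. It is not the changeset itself: the names prefetch_hint and sum_with_prefetch and the prefetch distance of 16 elements are made up for illustration, and the instruction choices (prfm pldl1keep on ARM64, pld on ARM32) are assumptions about a typical implementation. Note that neither asm statement lists "cc" as a clobber.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not Eigen's actual macro.
inline void prefetch_hint(const void* addr) {
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  // ARM64: PRFM with the PLDL1KEEP hint (prefetch for load into L1).
  // No clobber list: the instruction only hints to the memory system.
  __asm__ __volatile__("prfm pldl1keep, [%[addr]]\n" : : [addr] "r"(addr));
#elif defined(__arm__) && (defined(__GNUC__) || defined(__clang__))
  // ARM32: PLD. It does not modify the condition flags, so "cc" is not
  // listed as a clobber.
  __asm__ __volatile__("pld [%[addr]]\n" : : [addr] "r"(addr));
#elif defined(__GNUC__) || defined(__clang__)
  // Other targets with GCC or Clang: let the compiler pick the instruction.
  __builtin_prefetch(addr);
#else
  (void)addr;  // No prefetch available: do nothing.
#endif
}

// Sums a vector while prefetching a fixed number of elements ahead.
inline float sum_with_prefetch(const std::vector<float>& v) {
  const std::size_t ahead = 16;  // arbitrary prefetch distance, in elements
  float s = 0.0f;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (i + ahead < v.size()) prefetch_hint(&v[i + ahead]);
    s += v[i];
  }
  return s;
}
```

A prefetch is only a hint, so the result of sum_with_prefetch is the same with or without it; only the timing may change, and whether it helps depends on the access pattern and the core.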

