Re: [eigen] RFC: making a deterministic and reproducible product codepath

Rasmus,

It depends. For instance, are we talking about small or tiny matrices? If so, coefficient-based algorithms are already the fastest implementations; the remaining questions are whether to turn vectorization on or off and what the packet size implies. That can also serve as a building block in a parallel or multi-threaded setting. I'm guessing, though, that you're thinking only of large matrix products. Performance can still be good there, but yes, we're not going to get the highest theoretical throughput.

The issue is not a lack of understanding of the mathematics in one's solution; it is reproducibility. Always getting the same answer from the same inputs is worth more than error analysis when it comes to fault detection and system consistency in the computation itself. Error analysis qualifies how good and useful the answers are to the application (which also matters a lot!), but the two shouldn't be expected to substitute for each other. Also, tracking an error bound through large, complex software quickly becomes intractable. It is reasonable for single routines, but doing it for all the software we'd want to understand in a larger system would be very expensive, more so than writing reproducible software.
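As a rough illustration of what a reproducible building block could look like, here is a minimal sketch in plain C++ (standard library only, not Eigen code). It assumes a fixed block size, `kBlock`, chosen arbitrarily for the example, and uses a hypothetical `schedule` argument as a stand-in for the order in which threads happen to compute the blocks. Partial sums are stored by block index and combined in index order, so the result depends only on the inputs and on `kBlock`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative block size; an assumption for this sketch, not an Eigen setting.
constexpr std::size_t kBlock = 4;

// Partial dot product of block b, accumulated strictly left to right.
double block_partial(const std::vector<double>& x,
                     const std::vector<double>& y,
                     std::size_t b) {
    const std::size_t begin = b * kBlock;
    const std::size_t end = std::min(begin + kBlock, x.size());
    double s = 0.0;
    for (std::size_t i = begin; i < end; ++i) {
        s += x[i] * y[i];
    }
    return s;
}

// `schedule` lists the block indices in the order they happen to be computed
// (a stand-in for thread scheduling). The partials are combined in block-index
// order, so the schedule cannot change the rounding of the final sum.
double reproducible_dot(const std::vector<double>& x,
                        const std::vector<double>& y,
                        const std::vector<std::size_t>& schedule) {
    assert(x.size() == y.size());
    const std::size_t nblocks = (x.size() + kBlock - 1) / kBlock;
    assert(schedule.size() == nblocks);
    std::vector<double> partial(nblocks, 0.0);
    for (std::size_t b : schedule) {
        partial[b] = block_partial(x, y, b);
    }
    double sum = 0.0;
    for (double p : partial) {
        sum += p;
    }
    return sum;
}
```

The cost is that the reduction tree is fixed up front rather than adapted to the number of threads or the cache layout, which is where the throughput loss mentioned above would come from.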

-Jason

On Fri, Sep 9, 2016 at 11:38 AM, Rasmus Larsen <rmlarsen@xxxxxxxxxx> wrote:

Just to throw in my 2 cents (mostly in the context of linear algebra): I understand that some perceive a great benefit in having 100% reproducible and deterministic computations, but I don't think it is realistic or even particularly useful in a parallel or multi-threaded environment, unless you want to grossly sacrifice performance. In my practical experience, this idea often comes from a desire to get predictable outcomes of regression tests written without proper knowledge of the mathematics of the problem, i.e. error analysis. I have found time and time again that looking up the correct error bound in the literature (or deriving it) and possibly adding a small fudge factor solves such problems in a way that might even yield useful insights.
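For a dot product, the kind of bound meant here could look like the following sketch in plain C++ (not Eigen code). It uses the standard forward error bound |fl(x.y) - x.y| <= gamma_n * sum |x_i y_i| with gamma_n = n*u / (1 - n*u), which holds for any summation order under the usual floating-point model. The function names and the default fudge factor of 2 are assumptions made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Forward error bound for a length-n dot product computed in double precision.
double dot_error_bound(const std::vector<double>& x,
                       const std::vector<double>& y) {
    assert(x.size() == y.size());
    const double u = std::numeric_limits<double>::epsilon() / 2.0;  // unit roundoff
    const double n = static_cast<double>(x.size());
    const double gamma_n = (n * u) / (1.0 - n * u);
    double abs_dot = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        abs_dot += std::fabs(x[i] * y[i]);
    }
    return gamma_n * abs_dot;
}

// Two computed dot products of the same inputs are each within `bound` of the
// exact value, so they can differ by at most 2 * bound. The fudge factor
// (an arbitrary choice here) absorbs the rounding in the bound itself.
bool dots_agree(double a, double b, double bound, double fudge = 2.0) {
    return std::fabs(a - b) <= fudge * 2.0 * bound;
}
```

A regression test would then compare results with `dots_agree` instead of testing for bit-wise equality.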

On Fri, Sep 9, 2016 at 2:15 AM, Christoph Hertzberg <chtz@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
On 08.09.2016 at 21:45, Peter wrote:

In case you are interested, there's e.g. HP's Dynamo project,
<http://www.hpl.hp.com/techreports/1999/HPL-1999-77.html>,
which messes around with binaries. And for scalar products, it's
sufficient to change the order of evaluation
to lose bit-wise accuracy, e.g. the scalar product of (1, 1e-50, 1)
with (1, 1, -1) is a simple example.
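Peter's example can be checked with a few lines of plain C++ (a sketch, not Eigen code; `dot_in_order` and its `order` argument are hypothetical names used only to permute the terms). Summing left to right, 1 + 1e-50 rounds to 1 and the final result is 0; summing the first and third terms first gives 0 + 1e-50 = 1e-50.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product accumulated strictly left to right, visiting the terms in the
// sequence given by `order`.
double dot_in_order(const std::vector<double>& x,
                    const std::vector<double>& y,
                    const std::vector<std::size_t>& order) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i : order) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With x = (1, 1e-50, 1) and y = (1, 1, -1), the order {0, 1, 2} yields 0 and the order {0, 2, 1} yields 1e-50, assuming IEEE double arithmetic without value-changing compiler options such as `-ffast-math`.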

Sure, I'm aware that IEEE math is non-associative ...

I'm just not sure how far the processors mess around.

By design they should only execute things out of order if the instructions (or the generated micro-instructions) are independent.
Everything beyond that would be insane, IMO. I'm not a CPU expert, but I'm pretty sure a lot of people would have complained about it if CPUs did that.

I agree, using a F77 BLAS should be sufficient. Although I still don't
understand what one learns from bypassing all optimizations.

If you have a BLAS implementation that does exactly the same on every target architecture, you should be fine as well, of course. I don't know what the status of F77->GPU compilers is.

If correctness is important one should switch to exact scalar product,
like in C-XSC,
which removes the dependence on the order of evaluation and just _has_
to provide the same result everywhere.

The original RFC was just on reproducibility not on exactness, I think.

BTW, exact scalar products could be an interesting extension to Eigen in
some future version,
opening the door to verified computing.

Sure, that would be interesting. I'm not sure how complicated this will be to integrate though. And it will certainly be significantly slower.
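To give a feel for the cost involved, here is a sketch of a compensated dot product in plain C++. Note that this is not the exact long-accumulator scalar product of C-XSC; it is the cheaper Dot2 scheme of Ogita, Rump and Oishi, which behaves roughly as if the sum were computed in twice the working precision and can still depend on the evaluation order in the last bits. It relies on `std::fma` and on the compiler not reassociating floating-point operations. The `Split` struct and the function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A value represented as hi + lo, where lo holds the rounding error of hi.
struct Split {
    double hi;
    double lo;
};

// Error-free transformation of a sum: a + b == hi + lo exactly (TwoSum).
Split two_sum(double a, double b) {
    const double s = a + b;
    const double z = s - a;
    return {s, (a - (s - z)) + (b - z)};
}

// Error-free transformation of a product, barring overflow and underflow:
// a * b == hi + lo exactly, using a fused multiply-add.
Split two_prod(double a, double b) {
    const double p = a * b;
    return {p, std::fma(a, b, -p)};
}

// Compensated dot product: the rounding errors of every product and every
// addition are accumulated separately and added back at the end.
double compensated_dot(const std::vector<double>& x,
                       const std::vector<double>& y) {
    assert(x.size() == y.size());
    double p = 0.0;  // running sum of the high-order parts
    double s = 0.0;  // accumulated low-order corrections
    for (std::size_t i = 0; i < x.size(); ++i) {
        const Split prod = two_prod(x[i], y[i]);
        const Split sum = two_sum(p, prod.hi);
        p = sum.hi;
        s += sum.lo + prod.lo;
    }
    return p + s;
}
```

For the (1, 1e-50, 1) and (1, 1, -1) example this returns 1e-50 in either evaluation order, at the price of several extra floating-point operations per term.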

Christoph

--
Dipl. Inf., Dipl. Math. Christoph Hertzberg

Universität Bremen
FB 3 - Mathematik und Informatik
AG Robotik
Robert-Hooke-Straße 1
28359 Bremen, Germany
