Re: [eigen] Totally missing /O2 for MSVC-Compiler
Gael Guennebaud wrote:
On Fri, Jan 30, 2009 at 1:30 AM, FMDSPAM <fmdspam@xxxxxxxxx> wrote:
Two ctest questions:
hm, three actually ;)
hm, yes :-)
a) the newly approved compiler options will go into the normal cdash suite,
right?
I don't really understand what you meant here, but FYI I'll make sure
the large_product test and the lu test are always compiled with /O2
for msvc (otherwise they take too long to run).
OK, but why only these two test cases?
a) I would prefer to test everything under the hardest conditions available.
b) In real use cases we want to use the compiler flags that give the best
possible performance anyway.
b) how to set different compiler options (e.g. for further internal
testing and benchmarking)?
This is a pure cmake issue and I don't know what the best way to
proceed is. You can also use the CMAKE_BUILD_TYPE option to switch
between release and debug modes. Perhaps we could add a "custom" build
mode where all the options would be set manually using an
EIGEN_CXX_FLAGS variable.
I will look into this and post any useful results.
I also have in mind additional tests of e.g. the /fp flag, with regard to
both correctness and performance.
If all goes well, I am considering submitting several configurations to
cdash on a regular basis. Any objections?
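The /fp flag matters for correctness because floating-point addition is not associative, and MSVC's /fp:fast mode is allowed to reassociate sums. The following is only a minimal sketch of that effect, not part of the Eigen test suite; the function names sum_left and sum_right are made up for illustration, and the volatile temporaries are there only to force each intermediate result to be rounded to float.

```cpp
#include <cassert>

// Hypothetical helper: evaluates (a + b) + c, rounding the
// intermediate sum to float before adding c.
float sum_left(float a, float b, float c) {
    volatile float t = a + b;
    return t + c;
}

// Hypothetical helper: evaluates a + (b + c), rounding the
// intermediate sum to float before adding a.
float sum_right(float a, float b, float c) {
    volatile float t = b + c;
    return a + t;
}
```

With a = 1e30f, b = -1e30f and c = 1.0f, the first grouping cancels the large terms before adding c, while the second loses c when it is added to -1e30f. A compiler that is permitted to switch between the two groupings can therefore change results, which is why testing under different /fp settings seems worthwhile.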
c) how to run ctest more or less completely, but without uploading? (I don't
want to make so much cdash noise.)
you do it manually, just as usual (cmake ; make ; ctest -v )
You are right, thank you.
Cheers
Frank
FYI:
Your original form of the "process initial unaligned coeffs" loop in
CacheFriendlyProduct.h now does the job perfectly, even with the "complete
optimization" (/Ox) flag set, for all use cases. I have attached the
revert patch. Maybe your original form is cleaner and should be
restored?
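As a rough cross-check of the patch, the two loop forms can be compared outside of Eigen. This is only a sketch under stated assumptions: the Columns struct and all function names are hypothetical, and plain float coefficients c[0..3] stand in for ei_pfirst(ptmp0..3). Both forms add the four products in the same left-to-right order, so on strict floating-point settings they should give the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the patch's inputs: four lhs columns and
// four scalars, where c[k] plays the role of ei_pfirst(ptmpk).
struct Columns {
    const float* lhs[4];
    float c[4];
};

// Form with a named temporary (the lines the patch removes).
void head_with_temporary(std::vector<float>& res, const Columns& in,
                         std::size_t alignedStart) {
    assert(alignedStart <= res.size());
    for (std::size_t j = 0; j < alignedStart; ++j) {
        float s = in.c[0] * in.lhs[0][j];
        s += in.c[1] * in.lhs[1][j];
        s += in.c[2] * in.lhs[2][j];
        s += in.c[3] * in.lhs[3][j];
        res[j] += s;
    }
}

// Single-expression form (the lines the patch restores).
void head_single_expression(std::vector<float>& res, const Columns& in,
                            std::size_t alignedStart) {
    assert(alignedStart <= res.size());
    for (std::size_t j = 0; j < alignedStart; ++j)
        res[j] += in.c[0] * in.lhs[0][j] + in.c[1] * in.lhs[1][j]
                + in.c[2] * in.lhs[2][j] + in.c[3] * in.lhs[3][j];
}

// Runs both forms on small integer-valued inputs, for which every
// product and sum is exactly representable in float, and compares.
bool forms_agree() {
    const float a[3] = {1.0f, 2.0f, 3.0f};
    const float b[3] = {4.0f, 5.0f, 6.0f};
    const float c[3] = {7.0f, 8.0f, 9.0f};
    const float d[3] = {10.0f, 11.0f, 12.0f};
    const Columns in = {{a, b, c, d}, {2.0f, 3.0f, 5.0f, 7.0f}};
    std::vector<float> r1(3, 1.0f), r2(3, 1.0f);
    head_with_temporary(r1, in, 3);
    head_single_expression(r2, in, 3);
    return r1 == r2;
}
```

Calling forms_agree() under each optimization setting of interest (/O2, /Ox, different /fp modes) would be one cheap way to see whether a given compiler configuration treats the two forms differently.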
Index: Eigen/src/Core/CacheFriendlyProduct.h
===================================================================
--- Eigen/src/Core/CacheFriendlyProduct.h (revision 918759)
+++ Eigen/src/Core/CacheFriendlyProduct.h (working copy)
@@ -435,13 +435,8 @@
{
/* explicit vectorization */
// process initial unaligned coeffs
- for (int j=0; j<alignedStart; ++j) {
- Scalar s = ei_pfirst(ptmp0)*lhs0[j];
- s += ei_pfirst(ptmp1)*lhs1[j];
- s += ei_pfirst(ptmp2)*lhs2[j];
- s += ei_pfirst(ptmp3)*lhs3[j];
- res[j] += s;
- }
+ for (int j=0; j<alignedStart; ++j)
+ res[j] += ei_pfirst(ptmp0)*lhs0[j] + ei_pfirst(ptmp1)*lhs1[j] + ei_pfirst(ptmp2)*lhs2[j] + ei_pfirst(ptmp3)*lhs3[j];
if (alignedSize>alignedStart)
{