Dear EIGEN Team and Users,
I'm new here and I have a newbie question. I have implemented a version of Block Conjugate Gradient to solve AX=B, where A is n x n and X and B are n x p. I then compared the wall time of this block version (using p=1) against a classic CG that uses just VectorXd. Result: the block version took more time.
Specifically, I see that the operation A*V, with V defined as MatrixXd V(n,1), takes more time than the operation A*v, with v defined as VectorXd v(n). Is this an expected result? Can I do something to bring the two timings closer together?
I have attached some snapshots from VTune comparing these runs.
Thanks in advance.
Best Regards
Pedro Torres
[Attached VTune snapshot: Block CG]
[Attached VTune snapshot: CG Classic]