I'd need a few more details to get an idea of what is happening, e.g.:
Are the matrix sizes the same all the time (and how big are they)?
Does cur->i = this->bi involve resizing/memory allocation?
Is 0.4--2.1 seconds the time for all four computations together, for each one, or for many repetitions?
The easiest for us would be if you could produce a minimal working example that exposes the problem.
Cheers,
Christoph

I have used Eigen to do the following computation in a deep learning
program:
cur->i = this->bi;
cur->i.noalias() += this->Wxi*xt + this->Whi*prev->h;
cur->f = this->bf;
cur->f.noalias() += this->Wxf*xt + this->Whf*prev->h;
cur->o = this->bo;
cur->o.noalias() += this->Wxo*xt + this->Who*prev->h;
cur->u = this->bu;
cur->u.noalias() += this->Wxu*xt + this->Whu*prev->h;

Each W* is a matrix, and h, f, i, o, u are all vectors.

But when I ran it several times with the same input data, the running time
varied a lot.
Sometimes it takes 0.4 seconds to get the output, and sometimes it takes
2.1 seconds.

How can I solve this problem and get a stable running time?

