Re: [eigen] a record for Eigen: 250 GFLOPS !!
- To: eigen@xxxxxxxxxxxxxxxxxxx
- Subject: Re: [eigen] a record for Eigen: 250 GFLOPS !!
- From: James Cloos <cloos@xxxxxxxxxxx>
- Date: Fri, 02 Jul 2010 14:48:54 -0400
- Copyright: Copyright 2009 James Cloos
- Openpgp: ED7DAEA6; url=http://jhcloos.com/public_key/0xED7DAEA6.asc
- Openpgp-fingerprint: E9E9 F828 61A4 6EA9 0F2B 63E7 997A 9F17 ED7D AEA6
>>>>> "GG" == Gael Guennebaud <gael.guennebaud@xxxxxxxxx> writes:
GG> Hi, this morning I played with a 48 cores AMD SMP server (8
GG> processors AMD-Opteron-8439-SE, 6 cores each @ 2,8 GHz) and a
GG> bi-processor made of Intel X5570 @ 2.93GHz (4 multithreaded cores
GG> each => a total of 8 cores, 16 threads), and here are the results
GG> for a product of 2048^2 matrices of floats:
GG> We can see that AMD's SSE implementation is half the speed of Intel's one.
I believe the X5570 (Nehalem/Gainestown) is new enough to have 128-bit
fp adders and multipliers, whereas the 8439 (Istanbul) probably still
has 64-bit adders and multipliers. That would explain the performance
difference you noticed with SSE floats. Or perhaps Istanbul needs to
serialize the ops not just with doubles but even with floats?
I'm not sure whether AMD's Magny-Cours processors have 128-bit fp
units, but I've read that the upcoming Bulldozer arch definitely will.
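To put rough numbers on the unit-width argument, here is a back-of-the-envelope sketch of theoretical single-precision peak. The helper name peak_sp_gflops is hypothetical, and the model assumes that each core can issue one packed add and one packed multiply per cycle and that a 64-bit unit handles a 128-bit SSE register in two passes. Those are simplifying assumptions, not measured or documented figures for either chip.

```cpp
// Theoretical single-precision peak in GFLOPS (illustrative model only):
//   cores * clock in GHz * floats per cycle per unit * 2 units (add + mul).
// fp_unit_bits is the ASSUMED width of the FP adder and multiplier.
constexpr double peak_sp_gflops(int cores, double ghz, int fp_unit_bits) {
    // A float is 32 bits wide, so a 128-bit unit processes 4 floats per
    // cycle and a 64-bit unit processes 2.
    // Assumption: one add and one multiply can be issued every cycle.
    return cores * ghz * (fp_unit_bits / 32) * 2;
}
```

Under these assumptions, peak_sp_gflops(8, 2.93, 128) gives about 187.5 for the dual X5570, while the 48-core Opteron box comes out at about 537.6 with 64-bit units and 1075.2 with 128-bit units. Halving the unit width halves the per-core peak, which is the size of gap the SSE float results would show if the 64-bit guess is right.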
-JimC
--
James Cloos <cloos@xxxxxxxxxxx> OpenPGP: 1024D/ED7DAEA6