DGEMM regression on SkylakeX #1955

Closed
@staticfloat

Description

It looks like 45fe8cb has introduced a regression in Julia's pinv() results on SkylakeX. In particular, computing the pseudo-inverse of a 1000 x 100 Hilbert matrix now gives an incorrect result:

using LinearAlgebra

function hilb(T::Type, m::Integer, n::Integer)
    a = Matrix{T}(undef, m, n)
    for i=1:n
        for j=1:m
            a[j,i]=one(T)/(i+j-one(T))
        end
    end
    return a
end
hilb(m::Integer, n::Integer) = hilb(Float64,m,n)

a = hilb(1000, 100)
apinv = pinv(a)

With the SkylakeX kernel included (i.e. at 45fe8cb), the result is:

100×1000 Array{Float64,2}:
  2.57526e6   -2.33247e6    2.21848e6    2.19307e6   -4.13046e6   …  -4.71439e5   -6.80621e5   -6.56864e5   -8.6676e5    -3.86363e5
 -1.22338e11  -2.20992e11   2.36372e11  -1.14835e11  -9.1049e10       1.13475e10   8.51702e9    3.51379e9    1.54455e10   2.8167e8
  2.45922e11   3.06366e11  -3.45368e11   6.99788e10   1.34305e11     -2.0333e10   -1.72032e10  -6.35715e9   -3.15537e10   3.42079e9
 -1.98151e10  -5.04668e10   6.22131e9   -4.37235e10  -3.29137e9       2.4302e9     3.26304e9    2.4001e9     5.31362e9    1.42072e8
 -3.96966e11   1.59586e10  -1.05208e11  -3.27214e11   3.74498e11      5.9171e10    8.17795e10   7.30775e10   1.11105e11   3.58937e10
 -1.15417e11  -1.8089e10    3.02927e10  -7.71434e10   6.99771e10  …   1.55784e10   1.9823e10    1.69518e10   2.74274e10   8.50587e9
 -7.91383e11   3.19284e11  -5.46209e11  -6.27492e11   1.00493e12      1.26433e11   1.8369e11    1.69215e11   2.44706e11   8.48811e10
 -1.12133e11  -3.60367e10  -1.90203e8   -9.61073e10   7.40389e10      1.55288e10   2.07537e10   1.779e10     2.90644e10   7.91371e9
  4.19346e11  -2.37896e11   3.70455e11   3.44512e11  -6.0224e11      -6.9822e10   -1.03364e11  -9.66593e10  -1.36339e11  -4.94829e10
 -9.51913e9   -1.29776e11   1.82371e11   1.45922e10  -1.32419e11     -3.82507e9   -1.03485e10  -1.22032e10  -1.15361e10  -6.94914e9
  2.81647e12  -3.70604e11   9.33114e11   2.35652e12  -2.88098e12  …  -4.29963e11  -5.98574e11  -5.4097e11   -8.06547e11  -2.73127e11
  2.23042e11  -1.50478e11   1.87991e11   1.81127e11  -3.31279e11     -3.78415e10  -5.5537e10   -5.24234e10  -7.25081e10  -2.8155e10
  8.06723e11  -3.62461e11   6.16052e11   6.57412e11  -1.0719e12      -1.30794e11  -1.91569e11  -1.77425e11  -2.54474e11  -8.93792e10
  5.53103e10  -1.38396e10  -1.52335e10   3.90026e10  -4.70444e10     -8.49163e9   -1.06612e10  -9.72492e9   -1.3894e10   -6.12903e9
  9.43701e11  -5.29597e11   7.95135e11   7.22863e11  -1.31683e12     -1.54868e11  -2.28251e11  -2.12445e11  -3.0164e11   -1.08025e11
 -4.21561e11   1.74302e10  -1.05198e11  -3.6234e11    4.02288e11  …   6.38291e10   8.7825e10    7.88592e10   1.18726e11   3.97235e10
  ⋮                                                               ⋱   ⋮
  6.68906e10  -6.53051e9    3.21447e10   4.33472e10  -6.42927e10     -9.61282e9   -1.3713e10   -1.20454e10  -1.88794e10  -5.21562e9
 -4.3416e10    9.89228e9   -1.06994e10  -3.73798e10   4.63849e10  …   6.77796e9    9.31398e9    8.54109e9    1.23867e10   4.64291e9
 -3.36704e10  -2.19847e10   3.36634e10  -2.54336e10   4.56403e9       4.11065e9    4.50039e9    3.55012e9    6.46187e9    1.91442e9
  1.00752e11  -9.78859e10   1.5892e11    7.5809e10   -1.8612e11      -1.78414e10  -2.82659e10  -2.68784e10  -3.70109e10  -1.31011e10
 -6.24935e11   1.72017e11  -3.18826e11  -5.14607e11   7.21608e11      9.79871e10   1.39412e11   1.27482e11   1.86508e11   6.45552e10
 -3.56138e11   4.75922e10  -1.07088e11  -2.96353e11   3.60894e11      5.43011e10   7.52645e10   6.80322e10   1.01325e11   3.46671e10
  1.41477e11  -1.55157e11   2.42801e11   1.08982e11  -2.78805e11  …  -2.57232e10  -4.11459e10  -3.94458e10  -5.35713e10  -1.95093e10
 -1.71703e11   6.41864e9   -2.29926e10  -1.45472e11   1.56983e11      2.57486e10   3.4882e10    3.13021e10   4.71032e10   1.62219e10
 -1.57418e11  -2.50531e10   1.42196e10  -1.16675e11   1.07792e11      2.18507e10   2.86411e10   2.47383e10   3.9572e10    1.19948e10
 -5.09849e11   1.45783e11  -2.73511e11  -3.97073e11   5.85682e11      7.9147e10    1.13045e11   1.02973e11   1.5164e11    5.11554e10
 -2.01401e11   9.45476e10  -1.55585e11  -1.48556e11   2.6349e11       3.21347e10   4.71454e10   4.34308e10   6.28048e10   2.14634e10
  6.55703e11  -1.91764e11   3.65181e11   5.46763e11  -7.75768e11  …  -1.03493e11  -1.481e11    -1.35717e11  -1.97988e11  -6.84949e10
  4.39019e11  -7.20111e10   1.66746e11   3.76386e11  -4.6781e11      -6.78667e10  -9.50602e10  -8.63599e10  -1.27719e11  -4.38774e10
 -1.35314e11   1.43088e11  -2.17464e11  -8.70376e10   2.51535e11      2.36837e10   3.76675e10   3.57591e10   4.93057e10   1.72898e10
  2.95696e10   5.86712e10  -7.47343e10   2.73326e10   3.03537e10     -2.52465e9   -1.2501e9    -4.13464e7   -2.64329e9    3.99575e7
 -7.70318e10  -3.93945e10   4.41432e10  -8.97948e10   4.01424e10      1.12653e10   1.37315e10   1.20607e10   1.87392e10   7.00317e9

Excluding the SkylakeX kernel (e.g. reverting to 544b069) gives the result:

100×1000 Array{Float64,2}:
     112.527      -6192.3         1.06925e5   -8.28373e5    3.21394e6   -6.01292e6   …  -2.99287e5   -3.02032e5   -3.04795e5   -3.07576e5
   -6305.8            4.64899e5  -9.07773e6    7.54681e7   -3.07426e8    5.99356e8       3.28027e7    3.31027e7    3.34047e7    3.37085e7
       1.1309e5      -9.42656e6   1.9735e8    -1.71896e9    7.25526e9   -1.46068e10     -8.71604e8   -8.79551e8   -8.8755e8    -8.95596e8
      -9.32272e5      8.33527e7  -1.82785e9    1.64741e10  -7.1497e10    1.47819e11      9.57181e9    9.65882e9    9.74639e9    9.83447e9
       3.98657e6     -3.73868e8   8.48896e9   -7.86389e10   3.49436e11  -7.39605e11     -5.19571e10  -5.24279e10  -5.29016e10  -5.33781e10
      -8.8007e6       8.57783e8  -2.00715e10   1.90643e11  -8.66324e11   1.8764e12   …   1.44167e11   1.45468e11   1.46778e11   1.48094e11
       7.90418e6     -8.06081e8   1.95704e10  -1.91875e11   8.97794e11  -2.00535e12     -1.75621e11  -1.77197e11  -1.78783e11  -1.80377e11
       2.40961e6     -2.157e8     4.61896e9   -3.98835e10   1.62513e11  -3.0448e11      -1.76662e9   -1.79326e9   -1.82037e9   -1.84804e9
      -5.54279e6      5.75778e8  -1.41936e10   1.40992e11  -6.67618e11   1.50955e12      1.37796e11   1.39031e11   1.40274e11   1.41523e11
      -4.00904e6      3.9166e8   -9.13369e9    8.60798e10  -3.86441e11   8.21383e11      5.04267e10   5.08898e10   5.13561e10   5.18251e10
       1.63339e6     -1.8959e8    5.10176e9   -5.45604e10   2.76223e11  -6.69193e11  …  -8.18735e10  -8.25983e10  -8.33273e10  -8.40596e10
       4.57405e6     -4.75446e8   1.17311e10  -1.16639e11   5.52734e11  -1.25049e12     -1.12594e11  -1.13605e11  -1.14622e11  -1.15644e11
       3.29825e6     -3.27272e8   7.73331e9   -7.3721e10    3.34416e11  -7.18287e11     -4.43866e10  -4.47954e10  -4.52068e10  -4.56208e10
  -37289.1            2.26601e7  -9.77279e8    1.3636e10   -8.32615e10   2.3651e11       4.70926e10   4.75026e10   4.79148e10   4.83286e10
      -2.81123e6      3.03789e8  -7.75788e9    7.96192e10  -3.89208e11   9.11377e11      9.80715e10   9.89445e10   9.98226e10   1.00705e11
      -3.65969e6      3.80616e8  -9.39927e9    9.35385e10  -4.43642e11   1.00441e12  …   8.90641e10   8.98655e10   9.06717e10   9.14823e10
       ⋮                                                                 ⋮           ⋱
      -5.52909e5      6.12481e7  -1.61047e9    1.70918e10  -8.69055e10   2.14183e11      3.86899e10   3.90212e10   3.93541e10   3.96884e10
 -992608.0            1.08145e8  -2.79998e9    2.92745e10  -1.46579e11   3.54876e11  …   5.63476e10   5.68342e10   5.73232e10   5.78142e10
      -1.35946e6      1.47079e8  -3.78272e9    3.92898e10  -1.95373e11   4.69163e11      6.97768e10   7.03821e10   7.09905e10   7.16015e10
      -1.59876e6      1.72258e8  -4.41282e9    4.56535e10  -2.26067e11   5.40144e11      7.69398e10   7.76094e10   7.82824e10   7.89583e10
      -1.68423e6      1.80867e8  -4.61853e9    4.76281e10  -2.35036e11   5.59247e11      7.6759e10    7.74288e10   7.81023e10   7.87786e10
      -1.5861e6       1.69768e8  -4.32122e9    4.44178e10  -2.18431e11   5.17537e11      6.82601e10   6.88577e10   6.94585e10   7.0062e10
      -1.28093e6      1.36538e8  -3.46127e9    3.54304e10  -1.73446e11   4.08663e11  …   5.09162e10   5.13641e10   5.18145e10   5.2267e10
      -7.85551e5      8.29992e7  -2.08563e9    2.1155e10   -1.02525e11   2.38529e11      2.56914e10   2.59206e10   2.61511e10   2.63827e10
      -1.22319e5      1.1641e7   -2.59973e8    2.29043e9   -9.22435e9    1.59028e10     -5.87688e9   -5.92236e9   -5.96796e9   -6.01363e9
       6.34345e5     -6.95395e7   1.8113e9    -1.90533e10   9.60279e10  -2.3435e11      -4.02598e10  -4.06052e10  -4.09522e10  -4.13007e10
       1.37734e6     -1.49026e8   3.83373e9   -3.98355e10   1.98204e11  -4.76404e11     -7.24162e10  -7.30427e10  -7.36725e10  -7.43049e10
       1.94231e6     -2.09213e8   5.35875e9   -5.54398e10   2.74569e11  -6.56287e11  …  -9.49885e10  -9.58134e10  -9.66426e10  -9.74753e10
       2.11244e6     -2.26877e8   5.79482e9   -5.97807e10   2.95167e11  -7.02918e11     -9.83543e10  -9.92107e10  -1.00072e11  -1.00936e11
       1.59804e6     -1.71043e8   4.3541e9    -4.47652e10   2.20217e11  -5.22078e11     -6.99398e10  -7.0551e10   -7.11654e10  -7.17825e10
   23549.4           -1.60268e6   1.77704e7    5.98452e7   -1.59272e9    7.57578e9       6.25708e9    6.3078e9     6.35871e9    6.40975e9
      -3.07648e6      3.31117e8  -8.47524e9    8.76252e10  -4.33698e11   1.03595e12      1.49876e11   1.51177e11   1.52485e11   1.53798e11

Note that pinv() uses the SVD internally, so this turns into a LAPACK.gesdd!() call, which is itself giving very different answers. It should therefore be easy to reproduce locally by passing a Hilbert matrix of the above dimensions to dgesdd through whichever interface you prefer.
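
Since the entries of the pseudo-inverse of such an ill-conditioned matrix are hard to compare by eye, a residual-based check may be easier to act on. The following is a minimal sketch, not part of the original report: `hilbert_matrix` and `naive_mul` are hypothetical helper names introduced here, and the plain-loop product is only there so that the reference values do not themselves go through the BLAS kernel under suspicion. It calls dgesdd through Julia's `LAPACK.gesdd!` wrapper and DGEMM through `BLAS.gemm`.

```julia
using LinearAlgebra

# Hypothetical helper: same entries as hilb(Float64, m, n) in the report.
hilbert_matrix(m, n) = [1.0 / (i + j - 1) for i in 1:m, j in 1:n]

# Hypothetical helper: reference product with plain loops, bypassing BLAS.
function naive_mul(A::Matrix{Float64}, B::Matrix{Float64})
    m, k = size(A)
    k2, n = size(B)
    k == k2 || throw(DimensionMismatch("inner dimensions differ"))
    C = zeros(m, n)
    for j in 1:n, l in 1:k, i in 1:m
        @inbounds C[i, j] += A[i, l] * B[l, j]
    end
    return C
end

a = hilbert_matrix(1000, 100)

# Thin SVD straight from LAPACK dgesdd; gesdd! overwrites its input, hence the copy.
U, S, Vt = LAPACK.gesdd!('S', copy(a))

# Relative reconstruction error of U * Diagonal(S) * Vt against a.
# U .* S' scales the columns of U by the singular values.
svd_residual = norm(naive_mul(U .* S', Vt) - a) / norm(a)

# Relative difference between BLAS DGEMM (a' * a, forced through gemm) and the loops.
gemm_ref = naive_mul(permutedims(a), a)
gemm_residual = norm(BLAS.gemm('T', 'N', a, a) - gemm_ref) / norm(gemm_ref)

println("SVD reconstruction residual: ", svd_residual)
println("DGEMM vs. naive residual:    ", gemm_residual)
```

On a build without the regression, both residuals would be expected to sit within a small multiple of `eps(Float64)`, regardless of how ill-conditioned the Hilbert matrix is. A residual many orders of magnitude larger would point at the factorization or at the underlying DGEMM kernel.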
