use lufact in inv of Symmetric/Hermitian matrices #22686
Conversation
how much is spent in factorize vs backsolve with this? |
Updated the table with the percentage spent in factorization. It is definitely the backsolve that is slower for the Bunch-Kaufman approach. Probably related to #22561 (comment)? |
I also wonder if there might be any bias here in deviations from symmetry when lufact is used, especially for ill-conditioned inputs. Maybe not enough to worry about since you shouldn't usually use inv anyway. |
Updated to use the in-place |
Tested on my computer at home, which has better performance; here are the ratios:
|
Interesting! :)
Shall we merge this then? The performance improvements look consistent and substantial. |
Yes, please! 😄 |
There is a funny little story about numerical analysis hidden here. I happen to have spent some time on this because of a small post to NA Digest in 2015. Suddenly, Matlab's `inv` had changed its behavior. In Julia:

```julia
julia> H = [1/(i + j - 1) for i in 1:8, j in 1:8];

julia> issymmetric(H)
true

julia> issymmetric(inv(lufact(H)))
false
```

This was the case in older versions of Matlab. It is pretty annoying that the inverse of a symmetric matrix is not symmetric, and that was probably what made them change the behavior around R2013a. Suddenly, the matrix was symmetric. The way they did it was to use LU and then copy the elements from one triangle to the other. That would effectively be the same as this PR. The funny thing is that

```julia
julia> norm(inv(H)*(H*ones(8)) - 1)
8.704438947487126e-7
```

is fine, but

```julia
julia> norm(Symmetric(inv(lufact(H)))*(H*ones(8)) - 1)
0.2623294451513258
```

has a huge error. So the rounding errors in the LU-based inverse somehow cancel across the lower and upper triangles when solving. Also, this doesn't happen for Bunch-Kaufman, which only works on a single triangle:

```julia
julia> norm(inv(bkfact(H))*(H*ones(8)) - 1)
7.277205904268713e-7
```

Matlab changed the behavior again in R2016a, but without mentioning anything in their release notes. A weird thing is that from 2016b onward I get different behavior on my Mac and on a Linux system: the Linux inverse is symmetric but the Mac one isn't. Both have a small error now. The bottom line is that we shouldn't merge this as it is. |
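As an aside for readers following along without an old Julia build: the phenomenon is easy to reproduce in NumPy, which also obtains `inv` from an LU factorization (LAPACK `getrf`/`getri`). The script below is just an illustration, not part of the PR:

```python
import numpy as np

# 8x8 Hilbert matrix: exactly symmetric, but ill-conditioned (cond ~ 1e10).
n = 8
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
assert np.array_equal(H, H.T)

# np.linalg.inv goes through LU (getrf + getri), playing the role of
# inv(lufact(H)) above. The result is generally NOT exactly symmetric.
Hinv = np.linalg.inv(H)
print(np.array_equal(Hinv, Hinv.T))

# Yet the rounding errors in the two triangles largely cancel when
# applying the full inverse, so the residual stays small.
b = H @ np.ones(n)
print(np.linalg.norm(Hinv @ b - 1.0))
```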
We could continue to use Bunch-Kaufman, but since LU is so much faster it is probably better to figure out if symmetrizing with the average works:

```julia
julia> inv(H) |> t -> Symmetric(t + t')/2*(H*ones(8)) - 1 |> norm
1.5769221751231655e-6
``` |
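For what it's worth, the averaging trick is easy to sanity-check outside Julia too; a NumPy sketch (illustration only, mirroring the one-liner above):

```python
import numpy as np

n = 8
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
b = H @ np.ones(n)

Hinv = np.linalg.inv(H)      # LU-based inverse (getrf + getri)
S = (Hinv + Hinv.T) / 2.0    # symmetrize by averaging the triangles

# Averaging yields an EXACTLY symmetric matrix (IEEE addition is
# commutative and division by 2 is exact) ...
assert np.array_equal(S, S.T)

# ... and, unlike copying one triangle over the other, it keeps the
# residual of the same order as the unsymmetrized inverse.
print(np.linalg.norm(S @ b - 1.0))
```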
Oh, thanks for catching that. This only applies to |
Yes. That is fine. The problem is only when copying elements from one triangle to the other in the unsymmetric LU based inverse. We might actually want to add the Hilbert matrix example as a test case to make sure we don't accidentally make the mistake in the future. |
Okay.
Yeah, planned to do that. For the in-place symmetrization, there is no such function implemented, right? |
Not to my knowledge |
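For reference, the in-place averaging loop being discussed could look roughly like this (a NumPy stand-in with a hypothetical name, not an existing Julia or NumPy function):

```python
import numpy as np

def symmetrize_mean_inplace(B):
    """Overwrite B with (B + B.T)/2, visiting each off-diagonal
    pair once; no temporary matrix is allocated."""
    n = B.shape[0]
    for i in range(n - 1):
        for j in range(i + 1, n):
            avg = (B[i, j] + B[j, i]) / 2.0
            B[i, j] = avg
            B[j, i] = avg
    return B

A = np.random.rand(5, 5)
S = symmetrize_mean_inplace(A.copy())
assert np.array_equal(S, S.T)           # exactly symmetric
assert np.allclose(S, (A + A.T) / 2.0)  # matches the averaged matrix
```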
As an aside, I would like to point out how lovely it is that you can so easily test out the behavior of different factorization approaches with code as simple and clear as this:

```julia
norm(inv(H)*(H*ones(8)) - 1)
norm(Symmetric(inv(lufact(H)))*(H*ones(8)) - 1)
norm(inv(bkfact(H))*(H*ones(8)) - 1)
``` |
Okay, I have an updated version now. Benchmarks:

[benchmark table not recovered from the page]

and the allocations are still half for this PR. |
base/linalg/symmetric.jl (outdated)

```julia
# symmetrize
if A.uplo == 'U' # add to upper triangle
    @inbounds for i = 1:(n-1), j = (i+1):n
        B[i,j] = conjugate ? (B[i,j] + conj(B[j,i])) / 2 : (B[i,j] + B[j,i]) / 2
```
Having a branch here might be expensive, so I think it might be worthwhile to split the method. Maybe as an `@eval`ed loop.
It feels like LLVM should be able to do something like that by itself, by noticing that `conjugate` is constant in the loop. Probably worth benchmarking before changing the code to a harder-to-read version.
I will test it; I just copied the structure from `copytri!`.
Benchmarking (and some amateurish inspection of the generated code) says this is not a problem.
lgtm modulo the splitting @andreasnoack suggested! :)
```julia
inv(A::Symmetric{T,S}) where {T<:BlasFloat,S<:StridedMatrix} = Symmetric{T,S}(inv(bkfact(A)), A.uplo)
function _inv(A::HermOrSym)
    n = checksquare(A)
    B = inv!(lufact(A))
```
Do all `lufact` return types always support in-place `inv!`?
Yes, this will only be called for `StridedMatrix` and `BlasFloat`, so presumably this should be OK?
base/linalg/symmetric.jl (outdated)

```julia
conjugate = isa(A, Hermitian)
# symmetrize
if A.uplo == 'U' # add to upper triangle
    @inbounds for i = 1:(n-1), j = (i+1):n
```
Would inverting the LU factorization ever make a zero imaginary component non-zero? If so, we should probably include the diagonal for `Hermitian`.
Might as well always include the diagonal; it's not much more work.
> Would inverting the LU factorization ever make a zero imaginary component non-zero?

Seems like it actually does (sometimes):

```julia
julia> complex.(rand(2,2), rand(2,2)) |> t -> (t + t') |> t -> Hermitian(inv(lufact(t)))
ERROR: ArgumentError: Cannot construct Hermitian from matrix with nonreal diagonals
Stacktrace:
[...]
```

Will include the diagonal and add a test.
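One way to see why including the diagonal suffices: for any complex z, (z + conj(z))/2 is exactly real in floating point, so a conjugate average over all entries leaves no spurious imaginary parts on the diagonal. A NumPy illustration (not the PR's code):

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.random((4, 4)) + 1j * rng.random((4, 4))
A = t + t.conj().T                 # exactly Hermitian input

Ainv = np.linalg.inv(A)            # LU-based inverse; the diagonal may
                                   # pick up tiny imaginary parts

# Conjugate-average over ALL entries, diagonal included.
B = (Ainv + Ainv.conj().T) / 2.0

assert np.array_equal(B, B.conj().T)      # exactly Hermitian
assert np.all(B.diagonal().imag == 0.0)   # diagonal exactly real
```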
Done.
```julia
    end
    B
end
inv(A::Hermitian{T,S}) where {T<:BlasFloat,S<:StridedMatrix} = Hermitian{T,S}(_inv(A), A.uplo)
```
This was here before this PR, but the construction of the `Hermitian` here elides the check for complex diagonal elements. Should that check be moved to an inner constructor? Or, alternatively, just call `Hermitian` here without type arguments and rely on the outer constructor. Currently:

```julia
julia> H
2×2 Hermitian{Complex{Float64},Array{Complex{Float64},2}}:
 0.650488-0.0im       0.826686+0.667447im
 0.826686-0.667447im   1.81707-0.0im

julia> iH = inv(H)
2×2 Hermitian{Complex{Float64},Array{Complex{Float64},2}}:
  34.2282+2.13163e-14im  -15.5723-12.5727im
 -15.5723+12.5727im       12.2532+7.10543e-15im

julia> isreal(iH[1,1])
false
```
Why hasn't the imaginary part of the diagonal been zeroed by the symmetrization? Eventually, the check should just go away completely, see #17367
That example was for the original version, which did not loop to the diagonal, like in `copytri!`. See #22686 (comment).
Ah, got the chronology of the messages wrong. Then I guess this line is fine, right?
Yes.
AV x86 failed with: [CI log not recovered from the page]
presumably unrelated? |
Noticed in #22827 that this enables |
Seems to be generally faster; with the following
[benchmark script not recovered from the page]
I get:

[benchmark table comparing `bkfact` and `lufact` timings; the values were not recovered from the page]

Allocations are the same for both methods.
Would be nice if someone could confirm this on some other computer.