[RFC] Matrix trigonometry #23679
Conversation
I couldn't figure out why the tests were failing with a …
The tests failing were due to rounding errors in …
Maybe I had too much coffee... Anyway, here are some comments 😄
base/linalg/dense.jl
```julia
## Destructive matrix exponential using algorithm from Higham, 2008,
## "Functions of Matrices: Theory and Computation", SIAM
function exp!(A::StridedMatrix{T}) where T<:BlasFloat
    n = checksquare(A)
    if ishermitian(A)
        if T <: Real
```
Perhaps

```julia
if T <: Real && issymmetric(A)
    # ...
elseif ishermitian(A)
    # ...
end
```
The point is to avoid the check twice. If it is real, we know it is either both Hermitian and symmetric or neither.
But `issymmetric` will not be called unless `T <: Real`, so it is completely equivalent?
Yes, but if it is real but not symmetric, then it also isn't Hermitian, so the condition in the `elseif` will be false by definition and doesn't need to be checked.
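To make the point above concrete, here is a minimal sketch (a hypothetical helper, with names modernized to the current LinearAlgebra stdlib): for a real element type, Hermitian and symmetric coincide, so a single `ishermitian` call picks the wrapper and `issymmetric` never needs a separate check.

```julia
using LinearAlgebra

function structured_wrapper(A::AbstractMatrix{T}) where T
    if ishermitian(A)
        # real Hermitian == real symmetric, so no second check is needed
        return T <: Real ? Symmetric(A) : Hermitian(A)
    end
    return A  # no special structure detected
end
```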
Right, thanks!
@stevengj I think you're referring to the code snippet posted by @fredrikekre; the actual code in this PR avoids this double check by nesting the `if`s.
I don't know why having a `Symmetric` type is necessary (I don't know of a case where you have symmetric complex matrices), but I guess it's for semantic reasons? But then maybe `Symmetric = Hermitian{<:Real}` could just be defined?
Since we `full` here anyway, I think @stevengj's suggestion is OK. The `exp` methods should not differ much w.r.t. computational time. There will be some extra checking of the complex parts of the diagonal in the `Hermitian` constructor, but it should be negligible compared to the factorization and matrix multiplication:
julia/base/linalg/symmetric.jl
Lines 592 to 608 in b9aed45
```julia
function exp(A::Symmetric)
    F = eigfact(A)
    return Symmetric((F.vectors * Diagonal(exp.(F.values))) * F.vectors')
end
function exp(A::Hermitian{T}) where T
    n = checksquare(A)
    F = eigfact(A)
    retmat = (F.vectors * Diagonal(exp.(F.values))) * F.vectors'
    if T <: Real
        return real(Hermitian(retmat))
    else
        for i = 1:n
            retmat[i,i] = real(retmat[i,i])
        end
        return Hermitian(retmat)
    end
end
```
You can have separate `exp(A::Hermitian{<:Real})` and `exp(A::Hermitian{<:Complex})` methods, so there need not be any performance penalty to using `Hermitian` for both cases.
I would prefer to just deprecate `Symmetric` entirely in favor of `Hermitian{<:Real}`, since `Symmetric` only seems to be used for the real case.
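A minimal sketch of such a dispatch-based split (the `myexp` name is hypothetical, and the API is modernized to `eigen` from the current LinearAlgebra stdlib):

```julia
using LinearAlgebra

# Real case: the eigendecomposition is entirely real, no cleanup needed.
function myexp(A::Hermitian{<:Real})
    F = eigen(A)
    return Symmetric(F.vectors * Diagonal(exp.(F.values)) * F.vectors')
end

# Complex case: zero out the imaginary round-off on the diagonal
# before rewrapping as Hermitian.
function myexp(A::Hermitian{<:Complex})
    F = eigen(A)
    retmat = F.vectors * Diagonal(exp.(F.values)) * F.vectors'
    for i in axes(retmat, 1)
        retmat[i, i] = real(retmat[i, i])
    end
    return Hermitian(retmat)
end
```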
Explicitly checking `T <: Real` seems unnecessary to me, and explicit type checks in Julia are usually a sign that we should be using dispatch.
We should be able to just do `Hermitian!(retmat)`, and it should do the right thing depending on whether `retmat` is real or complex. See also the discussion in #17367.
@stevengj I changed the dispatch of `exp` and the other trigonometric functions to `HermOrSym{<:Real}` in the most recent commit. So now there is no waste of resources there. 😄
I asked on Slack, and there were actually some factorisation algorithms which worked for `Symmetric{<:Complex}`, so there seems to be some use for it. Though not here, obviously. A far more useful wrapper, in my opinion, would be `Antihermitian`, but this is probably not the right place to discuss it.
base/linalg/dense.jl
```julia
    end
    if ishermitian(A)
        return log(Hermitian(A))
    if T <: Real
```
Same comment here as above; but the change in order is good 👍
base/linalg/dense.jl
```julia
@@ -693,6 +698,198 @@ function inv(A::StridedMatrix{T}) where T
end

"""
    cos(A::Matrix)
```
Perhaps change to `cos(A::AbstractMatrix)`, or alternatively `cos(A::StridedMatrix)`?
base/linalg/dense.jl
```
If `A` is symmetric or Hermitian, its eigendecomposition ([`eigfact`](@ref)) is
used to compute the cosine. Otherwise, the cosine is determined by calling
`exp`.
```
Looks like this will fit within the 92 chars on the previous line?
Could also `@ref` to the matrix `exp` here.
base/linalg/dense.jl
```julia
if issymmetric(A)
    return full(cos(Symmetric(A)))
end
return real(exp(im*A))
```
Since `im*A` creates a new matrix, we can use `exp!` here.
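For reference, the identity in play: for real `A`, `exp(im*A) = cos(A) + im*sin(A)`, so `cos(A)` is just the real part. A quick check of the identity (using the ordinary `exp`, since `exp!` is internal):

```julia
using LinearAlgebra

A = [0.0 1.0; -1.0 0.0]   # a real test matrix
C = real(exp(im * A))     # cos(A) via Euler's identity
@assert C ≈ (exp(im * A) + exp(-im * A)) / 2
```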
base/linalg/dense.jl
```julia
if issymmetric(A)
    return full(tanh(Symmetric(A)))
end
X, Y = exp(A), exp(-A)
```
Here we can change the second `exp` to `exp!`.
base/linalg/dense.jl
```julia
if ishermitian(A)
    return full(tanh(Hermitian(A)))
end
X, Y = exp(A), exp(-A)
```
Here we can change the second `exp` to `exp!`.
base/linalg/dense.jl
```julia
hname = string(finvh)
@eval begin
    @doc """
        $($name)(A::Matrix)
```
`A::AbstractMatrix`
base/linalg/dense.jl
```julia
Compute the matrix $($fn) of a square matrix `A`.
""" ($finv)(A::AbstractMatrix{T}) where {T} = inv(($f)(A))
@doc """
    $($hname)(A::Matrix)
```
`A::AbstractMatrix`
base/linalg/symmetric.jl
```julia
n = checksquare(A)
F = eigfact(A)
retmat = (F.vectors * Diagonal(exp.(F.values))) * F.vectors'
S, C = Diagonal(similar(A)), Diagonal(similar(A))
```
These could be `similar(A, (n,))`; otherwise they will first allocate a full matrix and then extract the diagonal. Better to simply allocate the diagonal directly.
Also, for `cosh`:

```julia
function cosh(A::StridedMatrix{T}) where T
    if T <: Real && issymmetric(A)
        return full(cosh(Symmetric(A)))
    elseif ishermitian(A)
        return full(cosh(Hermitian(A)))
    end
    return (exp(A) + exp!(-A)) / 2
end
```
Thanks a lot for the comments! 💜
It would be great if you could also add the new docstrings to …
Done. Added inverse trigonometric functions too.
So in the latest commit I removed duplicates from …
base/linalg/dense.jl
```julia
    return full.(sincos(Hermitian(A)))
end
X, Y = exp!(im*A), exp!(-im*A)
return (X - Y) / 2im, (X + Y) / 2
```
Not sure we care, but you allocate four more matrices than necessary here: you could modify `X` and `Y` in place to be `(X-Y)/2im` and `(X+Y)/2` with a loop:

```julia
@inbounds for i in eachindex(X)
    x, y = X[i]/2, Y[i]/2
    X[i] = Complex(imag(x)-imag(y), real(y)-real(x))
    Y[i] = x+y
end
return X, Y
```

and similarly elsewhere.
Also, wouldn't it be faster to do

```julia
X = exp!(im*A)
Y = inv(X)
```

since surely `inv` is faster than `exp!`? There might be an accuracy problem if `X` is ill-conditioned, but wouldn't `exp!` have problems too in that case?
(Similarly below.)
base/linalg/dense.jl
```julia
if ishermitian(A)
    return full(cosh(Hermitian(A)))
end
return (exp(A) + exp!(-A)) / 2
```
You could avoid allocations here with

```julia
Y = exp(A)
Y .= (Y .+ exp!(-A)) ./ 2
```

But wouldn't it be more efficient to do

```julia
Y = exp(A)
Y .= (Y .+ inv(Y)) ./ 2
```

since surely `inv` is faster than `exp!`?
base/linalg/dense.jl
````julia
 -0.708073  0.291927
```
"""
function acos(A::StridedMatrix{T}) where T
````
Why is the `where T` needed here? Since you don't use the `T` parameter, why not just `acos(A::StridedMatrix)`?
base/linalg/dense.jl
```julia
if ishermitian(A)
    return full(acos(Hermitian(A)))
end
return -im * log(A + sqrt(A^2 - one(A)))
```
`A^2 - I` is preferred for this kind of thing.
Note that this is suboptimal because it computes the Schur factorization twice. As pointed out by Aprahamian (2016), you can instead compute the Schur factorization of `A` once.
base/linalg/dense.jl
```julia
if ishermitian(A)
    return full(acosh(Hermitian(A)))
end
return log(A + sqrt(A^2 - one(A)))
```
Aprahamian notes that this is the algorithm used in Octave and comments:

> This function has two weaknesses. First, the formula used for acosh is `acosh(A) = log(A + sqrt(A² − I))`, which differs from (3.3) (cf. Lemma 3.4) and does not produce the principal branch as we have defined it. Second, the formulas are implemented as calls to logm and sqrtm, and so two Schur decompositions are computed rather than one.

The efficiency issue I noted above. I'm more concerned with the fact that this produces the wrong branch cut. Are the branch cuts for your other inverse trig functions incorrect as well?
The formula they use, which produces the correct branch cut, is `acosh(A) = log(A + sqrt(A-I) * sqrt(A+I))`, which unfortunately requires a second matrix square root. (This formula makes it even more desirable to compute the Schur factors of `A` only once.)
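A quick sketch of that formula (`acosh_principal` is a hypothetical helper name, using the modern LinearAlgebra API; for a real input with eigenvalues ≥ 1, both square roots stay real):

```julia
using LinearAlgebra

# Aprahamian & Higham's principal-branch formula, which needs
# two matrix square roots instead of one:
acosh_principal(A::AbstractMatrix) = log(A + sqrt(A - I) * sqrt(A + I))
```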
Thanks a lot for the review and the linked article, @stevengj! ❤️ I think the inverse functions are much more solidly defined now, and I added some tests to make sure that the definitions of their principal values are met.
base/linalg/dense.jl
```julia
    SchurF = schurfact(complex(A))
    U = im * UpperTriangular(SchurF.T)
    R = full(log((I - U) \ (I + U)) / 2im)
    return SchurF.Z * R * SchurF.Z'
end
```
In principle, we could save a few matrix allocations here. But the code would get a lot uglier and I doubt it's worth the trouble — practical uses of matrix inverse trig functions are rare, and I doubt most users will care about saving a bit of memory here. They can always be further optimized later on if the need arises.
base/linalg/dense.jl
````julia
 -0.708073  0.291927
```
"""
function sincos(A::StridedMatrix{<:Real})
````
Why is the signature here (and elsewhere) `StridedMatrix`? It seems like it should be just `AbstractMatrix`.
(It may be that non-StridedMatrix types will throw a method error on `exp!` or something, but that's a limitation that exists elsewhere, and may someday be lifted, not in these routines themselves.)
Yes, you are right. The reason I chose `StridedMatrix` was that I thought it would be a strange error that `log` or `sqrt` isn't defined, but that is of course a lot more useful than just not having the inverse trig functions defined at all. Will change it.
It would be good to add a comment somewhere referencing the Aprahamian and Higham paper in the inverse-trig code if you haven't done so. (You could even add a link to the docstrings: …)
Now that I think about it, my suggestion to use `inv` was not such a good idea:

```julia
julia> S = [2 5; 1 8]; Λ = diagm([20,-15]); A = S * Λ / S
2×2 Array{Float64,2}:
 35.9091  -31.8182
 25.4545  -30.9091

julia> expm(A)
2×2 Array{Float64,2}:
 7.05695e8  -4.41059e8
 3.52847e8  -2.2053e8

julia> S * exp.(Λ) / S
2×2 Array{Float64,2}:
 7.05695e8  -4.41059e8
 3.52847e8  -2.2053e8

julia> Float64.(big.(S) * exp.(big.(Λ)) / big.(S))
2×2 Array{Float64,2}:
 7.05695e8  -4.41059e8
 3.52847e8  -2.2053e8

julia> inv(expm(-A))
2×2 Array{Float64,2}:
 6.13567e8  -3.83479e8
 3.06783e8  -1.9174e8

julia> expm(-A)
2×2 Array{Float64,2}:
 -1.48592e6  2.97183e6
 -2.37747e6  4.75493e6

julia> S * exp.(-Λ) / S
2×2 Array{Float64,2}:
 -1.48591e6  2.97183e6
 -2.37746e6  4.75493e6

julia> Float64.(big.(S) * exp.(-big.(Λ)) / big.(S))
2×2 Array{Float64,2}:
 -1.48591e6  2.97183e6
 -2.37746e6  4.75493e6

julia> inv(expm(A))
2×2 Array{Float64,2}:
 -1.74763e6  3.49525e6
 -2.7962e6   5.59241e6
```
test/linalg/dense.jl
```julia
# @test all(z -> (-π < imag(z) < π && real(z) > 0 ||
#                 0 <= imag(z) < π && real(z) ≈ 0 ||
#                 imag(z) ≈ π && real(z) >= 0),
#           eigfact(acosh(A))[:values])
```
What happened here?
CI fails, I don't know why. It passes locally. Would really like some help figuring out what is going on.
I thought maybe I had to rebase, but when I run the tests locally, they still pass.
`≈ 0` is not going to work, because `≈` tests relative error (there is no way to set a sensible default absolute tolerance). You need to test `abs(real(z)) < abstol` for some reasonable `abstol` for this matrix, e.g. `1e-10 * norm(acosh(A))`.
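To illustrate the point with a small sketch: `isapprox` with default tolerances can never succeed against an exact zero, so an explicit absolute tolerance is required.

```julia
# ≈ (isapprox) uses a relative tolerance by default, so any nonzero
# value compared against exactly zero fails:
@assert !(1e-300 ≈ 0.0)
@assert 0.0 ≈ 0.0   # only exactly zero matches zero

# the fix: an explicit absolute tolerance scaled to the problem
abstol = 1e-10
z = 1e-12 + 2.0im   # imagine an eigenvalue whose real part is round-off
@assert abs(real(z)) < abstol
```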
Alright, gonna try that now. 🙂
I still can't figure out why CI failed. I added the test back. Would like to see if it fails on other platforms as well, or just CircleCI, though Travis will probably never complete ...
base/linalg/dense.jl
```julia
X = exp(A)
X .= (X .+ inv(X)) ./ 2
return X
return (exp(A) .+ exp!(-A)) ./ 2
```
You should still do `X = exp(A)` and `X .= ...`, to avoid allocating an extra matrix for the result.
Right, thanks!
base/linalg/dense.jl
```julia
X = exp(A)
X .= (X .- inv(X)) ./ 2
return X
return (exp(A) .- exp!(-A)) ./ 2
```
See above.
Bumpitidy, rebase finished, looks like it builds.
Travis failure is #17626, which is unrelated. LGTM.
I'm seeing frequent failures in the linux32 buildog tests on this merge commit and following (starting at https://buildog.julialang.org/#/builders/41/builds/233). Could you look into it and see if it is possibly related?
@vtjnash It might be related to the …
These are normally called `cosm`, `sinm`, `tanm`, etc. But since `expm`, `logm`, `sqrtm` were deprecated, it seems most natural to add methods to the already existing trigonometric functions.

I imagine these additions might be somewhat controversial, as this will, as with `exp`, result in hard-to-find bugs when you really wanted automatic broadcasting. I don't know if there is a better argument against that than mathematical consistency. The matrix trigonometric functions arise naturally, as does the matrix exponential, when dealing with second-order coupled linear differential equations. They also arise in code golfing. This PR would reduce Alex's answer to 6 bytes.

The implementations here are pretty naïve. I checked how SciPy does it, and its implementations of `cos` and `sin` also use `exp`. At least here, symmetric and Hermitian matrices benefit from the eigendecomposition. Higham also has a paper on a faster algorithm for `cos` and `sincos` using Padé approximants, which would probably be nice to implement, but I'm not sure it's worth it, and it can always be done at a later point.

I also found some oddities and type instabilities in `exp` and `log` while venturing into this. I have added them to this PR. I will open another PR with just those changes later when I have time.
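As a rough illustration of the `exp`-based approach described above (a sketch only, with a hypothetical `sincos_naive` name; the PR's actual implementation dispatches on structure and works in place):

```julia
using LinearAlgebra

# Naive real-matrix sin/cos via Euler's formula, exp(im*A) = cos(A) + im*sin(A):
function sincos_naive(A::AbstractMatrix{<:Real})
    E = exp(im * A)           # one complex matrix exponential
    return imag(E), real(E)   # (sin(A), cos(A))
end
```

The identity `sin(A)^2 + cos(A)^2 == I` holds for matrices too, since both are power series in the same `A`, which makes a handy sanity check.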