
[RFC] Matrix trigonometry #23679

Merged (32 commits) Oct 17, 2017

Conversation

@dalum (Contributor) commented Sep 12, 2017

These are normally called cosm, sinm, tanm, etc., but since expm, logm, and sqrtm were deprecated, it seems most natural to add matrix methods to the already existing trigonometric functions.

I imagine these additions might be somewhat controversial: as with exp, they can lead to hard-to-find bugs when you really wanted automatic broadcasting. I don't know of a better argument against that than mathematical consistency. The matrix trigonometric functions arise naturally, as does the matrix exponential, when dealing with second-order coupled linear differential equations. They also arise in code golfing; this PR would reduce Alex's answer to 6 bytes.
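
For context, the standard second-order ODE connection (a textbook fact, not spelled out in this PR): for invertible $A$,

$$x''(t) = -A^2\,x(t) \quad\Longrightarrow\quad x(t) = \cos(At)\,x(0) + A^{-1}\sin(At)\,x'(0).$$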

The implementations here are pretty naïve. I checked how SciPy does it, and its implementations of cos and sin also use exp. At least here, symmetric and Hermitian matrices benefit from the eigendecomposition. Higham also has a paper on a faster algorithm for cos and sincos using Padé approximants, which would probably be nice to implement, but I'm not sure it's worth it, and it can always be done at a later point.
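
For concreteness, a minimal sketch of the exp-based approach just described (illustrative only; naive_cos and naive_sin are hypothetical names, not this PR's methods):

using LinearAlgebra  # provides the matrix exp method on Julia ≥ 0.7

# For real A, exp(im*A) = cos(A) + im*sin(A), so take real and imaginary parts.
naive_cos(A::AbstractMatrix{<:Real}) = real(exp(im * A))
naive_sin(A::AbstractMatrix{<:Real}) = imag(exp(im * A))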

I also found some oddities and type instabilities in exp and log while venturing into this. I have added those fixes to this PR; I will open another PR with just those changes later when I have time.

@dalum (Contributor, Author) commented Sep 13, 2017

I couldn't figure out why the tests were failing with a SingularException for inv(sinh(A2)) and inv(cosh(A2)) when elty == Float32, so I singled them out in the tests. I printed both cosh(A2) and sinh(A2), and they were clearly not singular, so something strange must be happening. This only happened when running the full suite via make test-linalg, not when manually running ./julia --check-bounds=yes --startup-file=no --depwarn=error test/linalg/dense.jl.

@dalum (Contributor, Author) commented Sep 14, 2017

The failing tests were due to rounding errors in exp with BLAS.set_num_threads(1), which makes lufact fail to find a solution for matrices with smaller-magnitude entries than otherwise. I have reduced the magnitude of the matrices accordingly, so the tests should now pass.

@fredrikekre (Member) left a comment

Maybe I had too much coffee... Anyway, here are some comments 😄


## Destructive matrix exponential using algorithm from Higham, 2008,
## "Functions of Matrices: Theory and Computation", SIAM
function exp!(A::StridedMatrix{T}) where T<:BlasFloat
    n = checksquare(A)
    if ishermitian(A)
        if T <: Real
@fredrikekre (Member) commented Sep 14, 2017:

Perhaps

if T <: Real && issymmetric(A)
    # ...
elseif ishermitian(A)
    # ...
end

Reply (Contributor, Author):

The point is to avoid doing the check twice. If the matrix is real, we know it is either both Hermitian and symmetric, or neither.

Reply (Member):

But issymmetric will not be called unless T <: Real, so it is completely equivalent?

Reply (Contributor, Author):

Yes, but if it is real but not symmetric, then it also isn't Hermitian, so the condition in the elseif will be false by definition and doesn't need to be checked.
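
Schematically, the nesting being defended here looks like this (a sketch using Julia 0.6-era names such as full; not the PR code verbatim):

function exp_nested_sketch(A::StridedMatrix{T}) where T
    if ishermitian(A)      # single symmetry check
        if T <: Real       # real + Hermitian implies symmetric; no issymmetric call needed
            return full(exp(Symmetric(A)))
        else
            return full(exp(Hermitian(A)))
        end
    end
    return exp(A)          # general (non-Hermitian) path
end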

Reply (Member):

Right, thanks!

@dalum (Contributor, Author) commented Sep 15, 2017:

@stevengj I think you're referring to the code snippet posted by @fredrikekre; the actual code in this PR avoids this double check by nesting the ifs.

I don't know why having a Symmetric type is necessary (I don't know of a case where you have symmetric complex matrices), but I guess it's for semantic reasons? But then maybe Symmetric = Hermitian{<:Real} could just be defined?

Reply (Member):

Since we call full here anyway, I think @stevengj's suggestion is ok. The exp methods should not differ much w.r.t. computational time. There will be some extra checking of the complex parts of the diagonal in the Hermitian constructor, but that should be negligible compared to the factorization and matrix multiplications:

function exp(A::Symmetric)
    F = eigfact(A)
    return Symmetric((F.vectors * Diagonal(exp.(F.values))) * F.vectors')
end
function exp(A::Hermitian{T}) where T
    n = checksquare(A)
    F = eigfact(A)
    retmat = (F.vectors * Diagonal(exp.(F.values))) * F.vectors'
    if T <: Real
        return real(Hermitian(retmat))
    else
        for i = 1:n
            retmat[i,i] = real(retmat[i,i])
        end
        return Hermitian(retmat)
    end
end

@stevengj (Member) commented Sep 19, 2017:

You can have separate exp(A::Hermitian{<:Real}) and exp(A::Hermitian{<:Complex}) methods, so there need not be any performance penalty to using Hermitian for both cases.

I would prefer to just deprecate Symmetric entirely in favor of Hermitian{<:Real}, since Symmetric only seems to be used for the real case.
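
A sketch of what that split could look like (exp_herm is a hypothetical name; eigfact is the 0.6-era spelling of today's eigen):

function exp_herm(A::Hermitian{<:Real})
    F = eigfact(A)
    return Hermitian((F.vectors * Diagonal(exp.(F.values))) * F.vectors')
end
function exp_herm(A::Hermitian{<:Complex})
    F = eigfact(A)
    retmat = (F.vectors * Diagonal(exp.(F.values))) * F.vectors'
    for i in 1:size(A, 1)
        retmat[i,i] = real(retmat[i,i])  # drop rounding noise so the Hermitian wrapper is valid
    end
    return Hermitian(retmat)
end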

Reply (Member):

Explicitly checking T<:Real seems unnecessary to me, and explicit type-checks in Julia are usually a sign that we should be using dispatch.

We should be able to just do Hermitian!(retmat), and it should do the right thing depending on whether retmat is real or complex. See also the discussion in #17367.

Reply (Contributor, Author):

@stevengj I changed the dispatch of exp and the other trigonometric functions to HermOrSym{<:Real} in the most recent commit. So now there is no waste of resources there. 😄

I asked on Slack, and there were actually some factorisation algorithms which worked for Symmetric{<:Complex}, so there seems to be some use for it. Though not here, obviously. A far more useful wrapper in my opinion would be Antihermitian, but this is probably not the right place to discuss it.

end
if ishermitian(A)
    return log(Hermitian(A))
if T <: Real
Reply (Member):

Same comment here as above; but the change in order is good 👍

@@ -693,6 +698,198 @@ function inv(A::StridedMatrix{T}) where T
end

"""
cos(A::Matrix)
Reply (Member):

Perhaps change to cos(A::AbstractMatrix), or alternatively cos(A::StridedMatrix)?


If `A` is symmetric or Hermitian, its eigendecomposition ([`eigfact`](@ref)) is
used to compute the cosine. Otherwise, the cosine is determined by calling
`exp`.
Reply (Member):

Looks like this will fit within the 92 chars on the previous line?

Reply (Member):

Could also @ref to matrix exp here.

if issymmetric(A)
    return full(cos(Symmetric(A)))
end
return real(exp(im*A))
Reply (Member):

Since im*A creates a new matrix we can use exp! here.
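
In other words, the suggestion amounts to (sketch):

return real(exp!(im * A))  # im*A is a freshly allocated complex matrix, so exp! may overwrite it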

if issymmetric(A)
    return full(tanh(Symmetric(A)))
end
X, Y = exp(A), exp(-A)
Reply (Member):

Here we can change the second exp to exp!.

if ishermitian(A)
    return full(tanh(Hermitian(A)))
end
X, Y = exp(A), exp(-A)
Reply (Member):

Here we can change the second exp to exp!.

hname = string(finvh)
@eval begin
    @doc """
        $($name)(A::Matrix)
Reply (Member):

A::AbstractMatrix

Compute the matrix $($fn) of a square matrix `A`.
""" ($finv)(A::AbstractMatrix{T}) where {T} = inv(($f)(A))
@doc """
    $($hname)(A::Matrix)
Reply (Member):

A::AbstractMatrix

n = checksquare(A)
F = eigfact(A)
retmat = (F.vectors * Diagonal(exp.(F.values))) * F.vectors'
S, C = Diagonal(similar(A)), Diagonal(similar(A))
Reply (Member):

These could be similar(A, (n,)), otherwise they will first allocate a full matrix and then extract the diagonal. Better to simply allocate the diagonal directly.
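
I.e., with n = checksquare(A) from above, sketched:

S, C = Diagonal(similar(A, (n,))), Diagonal(similar(A, (n,)))  # allocate only the length-n diagonals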

@fredrikekre (Member) commented Sep 14, 2017

Also, for cosh, sinh, and tanh, it seems like the methods for Real and Complex can be collapsed, similar to how exp and friends are written:

function cosh(A::StridedMatrix{T}) where T
    if T <: Real && issymmetric(A)
        return full(cosh(Symmetric(A)))
    elseif ishermitian(A)
        return full(cosh(Hermitian(A)))
    end
    return (exp(A) + exp!(-A)) / 2
end

@dalum (Contributor, Author) commented Sep 14, 2017

Thanks a lot for the comments! 💜

@fredrikekre (Member):

It would be great if you could also add the new docstrings to doc/src/stdlib/linalg.md. And while you are at it (if you feel like it) you can perhaps also add back (exp|log|sqrt)(::StridedMatrix), ref https://github.com/JuliaLang/julia/pull/23233/files#r138833801 😄

@dalum (Contributor, Author) commented Sep 14, 2017

Done. Added inverse trigonometric functions too.

@dalum (Contributor, Author) commented Sep 15, 2017

In the latest commit I removed duplicates from stdlib/math.md and stdlib/linalg.md. I guess that is probably the right thing to do, but then math.md only deals with numbers, and linalg is, strictly speaking, still math. Is this the right approach here?

    return full.(sincos(Hermitian(A)))
end
X, Y = exp!(im*A), exp!(-im*A)
return (X - Y) / 2im, (X + Y) / 2
@stevengj (Member) commented Sep 19, 2017:

Not sure we care, but you allocate four more matrices than necessary here: you could modify X and Y in place to be (X-Y)/2im and (X+Y)/2 with a loop:

@inbounds for i in eachindex(X)
    x, y = X[i]/2, Y[i]/2
    X[i] = Complex(imag(x)-imag(y), real(y)-real(x))
    Y[i] = x+y
end
return X, Y

and similarly elsewhere.

Reply (Member):

Also, wouldn't it be faster to do

X = exp!(im*A)
Y = inv(X)

since surely inv is faster than exp!? There might be an accuracy problem if X is ill-conditioned, but wouldn't exp! have problems too in that case?

(Similarly below.)

if ishermitian(A)
    return full(cosh(Hermitian(A)))
end
return (exp(A) + exp!(-A)) / 2
Reply (Member):

You could avoid allocations here with

Y = exp(A)
Y .= (Y .+ exp!(-A)) ./ 2

But wouldn't it be more efficient to do:

Y = exp(A)
Y .= (Y .+ inv(Y)) ./ 2

since surely inv is faster than exp!?

-0.708073 0.291927
```
"""
function acos(A::StridedMatrix{T}) where T
Reply (Member):

Why is the where T needed here? Since you don't use the T parameter, why not just acos(A::StridedMatrix)?

if ishermitian(A)
    return full(acos(Hermitian(A)))
end
return -im * log(A + sqrt(A^2 - one(A)))
Reply (Member):

A^2 - I is preferred for this kind of thing.

Note that this is suboptimal because it computes the Schur factorization twice. As pointed out by Aprahamian and Higham (2016), you can instead compute the Schur factorization of A just once.
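
Mechanically, the factorize-once idea could be sketched like this (0.6-era schurfact/full names; illustrative only, and it leaves the branch-cut question discussed below untouched):

function acos_schur_sketch(A::AbstractMatrix)
    SchurF = schurfact(complex(A))          # one Schur factorization
    U = UpperTriangular(SchurF.T)
    R = full(-im * log(U + sqrt(U^2 - I)))  # log and sqrt both operate on the triangular factor
    return SchurF.Z * R * SchurF.Z'
end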

if ishermitian(A)
    return full(acosh(Hermitian(A)))
end
return log(A + sqrt(A^2 - one(A)))
Reply (Member):

Aprahamian notes that this is the algorithm used in Octave and comments:

This function has two weaknesses. First, the formula used for acosh is acosh A = log(A + sqrt(A² − I)), which differs from (3.3) (cf. Lemma 3.4) and does not produce the principal branch as we have defined it. Second, the formulas are implemented as calls to logm and sqrtm and so two Schur decompositions are computed rather than one.

The efficiency issue I noted above. I'm more concerned with the fact that this produces the wrong branch cut. Are the branch cuts for your other inverse trig functions incorrect as well?

Reply (Member):

The formula they use, which produces the correct branch cut, is acosh(A) = log(A + sqrt(A-I) * sqrt(A+I)), which unfortunately requires a second matrix square root. (This formula makes it even more desirable to compute the Schur factors of A only once.)
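
Combined with a single Schur factorization, that formula could be sketched as (0.6-era names; illustrative, not the PR code verbatim):

function acosh_schur_sketch(A::AbstractMatrix)
    SchurF = schurfact(complex(A))
    U = UpperTriangular(SchurF.T)
    R = full(log(U + sqrt(U - I) * sqrt(U + I)))  # two square roots, but the principal branch
    return SchurF.Z * R * SchurF.Z'
end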

@dalum (Contributor, Author) commented Sep 20, 2017

Thanks a lot for the review and the linked article, @stevengj! ❤️ I think the inverse functions are much more solidly defined now, and I added some tests to make sure that the definitions of their principal values are met.

    SchurF = schurfact(complex(A))
    U = im * UpperTriangular(SchurF.T)
    R = full(log((I - U) \ (I + U)) / 2im)
    return SchurF.Z * R * SchurF.Z'
end
@stevengj (Member) commented Sep 20, 2017:

In principle, we could save a few matrix allocations here. But the code would get a lot uglier and I doubt it's worth the trouble — practical uses of matrix inverse trig functions are rare, and I doubt most users will care about saving a bit of memory here. They can always be further optimized later on if the need arises.

-0.708073 0.291927
```
"""
function sincos(A::StridedMatrix{<:Real})
@stevengj (Member) commented Sep 20, 2017:

Why is the signature here (and elsewhere) StridedMatrix? It seems like it should be just AbstractMatrix.

(It may be that non-StridedMatrix types will throw a method error on exp! or something, but that's a limitation that exists elsewhere, and may someday be lifted, not in these routines themselves.)

Reply (Contributor, Author):

Yes, you are right. I chose StridedMatrix because I thought it would be strange to get an error that log or sqrt isn't defined, but that is of course a lot more useful than not having the inverse trig functions defined at all. Will change it.

@stevengj (Member):

It would be good to add a comment somewhere referencing the Aprahamian and Higham paper in the inverse-trig code if you haven't done so. (You could even add a link to the docstrings: For the theory and logarithmic formulas used to compute this function, see [Aprahamian and Higham (2016)](https://doi.org/10.1137/16M1057577)..)

@stevengj (Member) commented Sep 20, 2017

Now that I think about it, my suggestion to use inv(exp(A)) instead of exp(-A) was not a good one. It looks like the latter is indeed much more accurate when exp(A) is ill-conditioned. So I'm afraid we should probably call exp twice.

julia> S = [2 5; 1 8]; Λ = diagm([20,-15]); A = S * Λ / S
2×2 Array{Float64,2}:
 35.9091  -31.8182
 25.4545  -30.9091


julia> expm(A)
2×2 Array{Float64,2}:
 7.05695e8  -4.41059e8
 3.52847e8  -2.2053e8 

julia> S * exp.(Λ) / S
2×2 Array{Float64,2}:
 7.05695e8  -4.41059e8
 3.52847e8  -2.2053e8 

julia> Float64.(big.(S) * exp.(big.(Λ)) / big.(S))
2×2 Array{Float64,2}:
 7.05695e8  -4.41059e8
 3.52847e8  -2.2053e8 

julia> inv(expm(-A))
2×2 Array{Float64,2}:
 6.13567e8  -3.83479e8
 3.06783e8  -1.9174e8 


julia> expm(-A)
2×2 Array{Float64,2}:
 -1.48592e6  2.97183e6
 -2.37747e6  4.75493e6

julia> S * exp.(-Λ) / S
2×2 Array{Float64,2}:
 -1.48591e6  2.97183e6
 -2.37746e6  4.75493e6

julia> Float64.(big.(S) * exp.(-big.(Λ)) / big.(S))
2×2 Array{Float64,2}:
 -1.48591e6  2.97183e6
 -2.37746e6  4.75493e6

julia> inv(expm(A))
2×2 Array{Float64,2}:
 -1.74763e6  3.49525e6
 -2.7962e6   5.59241e6

# @test all(z -> (-π < imag(z) < π && real(z) > 0 ||
#                 0 <= imag(z) < π && real(z) ≈ 0 ||
#                 imag(z) ≈ π && real(z) >= 0),
#           eigfact(acosh(A))[:values])
Reply (Member):

What happened here?

Reply (Contributor, Author):

CI fails and I don't know why; it passes locally. I would really like some help figuring out what is going on.

Reply (Contributor, Author):

I thought maybe I had to rebase, but when I run the tests locally, they still pass.

Reply (Member):

≈ 0 is not going to work, because ≈ tests relative error (there is no way to set a sensible default absolute tolerance). You need to test abs(real(z)) < abstol for some reasonable abstol for this matrix, e.g. 1e-10 * norm(acosh(A)).
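
Applied to the commented-out test above, the suggestion reads roughly (sketch; 1e-10 is the example factor from this comment, and eigfact(...)[:values] is the 0.6-era API used in the test file):

B = acosh(A)
abstol = 1e-10 * norm(B)
@test all(z -> (-π < imag(z) < π && real(z) > 0 ||
                0 <= imag(z) < π && abs(real(z)) < abstol ||
                abs(imag(z) - π) < abstol && real(z) >= 0),
          eigfact(B)[:values])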

Reply (Contributor, Author):

Alright, gonna try that now. 🙂

@dalum (Contributor, Author) commented Sep 20, 2017

I still can't figure out why CI failed. I added the test back. Would like to see if it fails on other platforms as well, or just CircleCI, though Travis will probably never complete ...

-    X = exp(A)
-    X .= (X .+ inv(X)) ./ 2
-    return X
+    return (exp(A) .+ exp!(-A)) ./ 2
Reply (Member):

You should still do X = exp(A) and X .= ..., to avoid allocating an extra matrix for the result.

Reply (Contributor, Author):

Right, thanks!

-    X = exp(A)
-    X .= (X .- inv(X)) ./ 2
-    return X
+    return (exp(A) .- exp!(-A)) ./ 2
Reply (Member):

See above.

@dalum (Contributor, Author) commented Oct 16, 2017

Bumpitidy, rebase finished, looks like it builds.

@stevengj (Member):

Travis failure is #17626, which is unrelated. LGTM.

@andreasnoack andreasnoack merged commit 361ceb0 into JuliaLang:master Oct 17, 2017
@dalum dalum deleted the evey/matrig branch October 17, 2017 08:58
Sacha0 added three commits referencing this pull request to Sacha0/julia on Oct 17, 2017
@vtjnash (Member) commented Nov 6, 2017

I'm seeing frequent failures in the linux32 buildog tests on this merge commit and following (starting at https://buildog.julialang.org/#/builders/41/builds/233). Could you look into it and see if it is possibly related?

@dalum (Contributor, Author) commented Nov 10, 2017

@vtjnash It might be related to the abstol parameter in the tests. I thought it was fixed by increasing it from 1e-10 to 1e-8 times the norm. Could it be that something in LAPACK is a little non-deterministic there? I don't have a 32-bit machine, so I can't test it, but I guess it might be fixed by increasing the abstol scaling to 1e-7.

Labels: linear algebra
6 participants