[WIP] README updates for 0.7 #467


Merged: 2 commits merged into JuliaArrays:master from schmrlng:README_0.7 on Sep 27, 2019

Conversation

@schmrlng (Contributor)

Since this affects the GitHub front page of StaticArrays.jl, these changes should probably be merged only once 0.7 is actually released.

Regarding badges - do we want to keep the 0.5 and 0.6 passing badges up indefinitely?

@andyferris (Member)

Thanks. Not sure what the best timing is, but we can merge this before JuliaCon.

There's still a v0.6 version that will be in use for a while (we may even fix a bug or two), so I'd like to keep that badge around along with a v0.7 (or v1.0, soon I hope :)) badge.

@andyferris (Member)

Out of curiosity, where is this at?

@schmrlng (Contributor, Author) commented Aug 4, 2018

I've just added a few more changes -- dropping the v0.5 badge, adding a v0.7 badge (which actually doesn't seem to be working yet: https://pkg.julialang.org/badges/StaticArrays_0.7.svg), and removing the outdated discussion on ImmutableArrays and FixedSizeArrays. I can squash the two commits if necessary. The "Speed" section still needs to be updated (I'm not sure which benchmark script was used to generate those results) and the reference therein to julia 0.5 removed.

If the plan is to maintain an 0.6 branch (the closest thing at the moment is the v0.7.2 tag), maybe we should do something along the lines of the message at the top of the README here: https://github.com/timholy/Revise.jl.

@andyferris (Member)

We should fix this up for v1.0

@KristofferC (Contributor) commented Sep 6, 2018

Note that SLP vectorization is on by default, so any reference to -O3 is likely not needed anymore.

@andyferris (Member)

Yes, there is a bit of out-of-date cruft in there.

@schmrlng (Contributor, Author) commented Sep 6, 2018

Currently this PR has 8 badges at the top. These are listed below with a proposal for what to do with each in parentheses. Thoughts/comments (especially on the dual code coverage badges) are welcome.

  • pkg.julialang.org 0.6 badge (remove)
  • pkg.julialang.org 0.7 badge, broken (remove for now)
  • Travis badge (keep)
  • Appveyor badge (keep)
  • Coveralls badge
  • CodeCov badge (keep only one of this or previous?)
  • Latest docs (keep)
  • Stable docs (keep)

Other notes:

  • I've made a new README_benchmarks.jl script for the README microbenchmarks.
  • I removed the line "For example, the performance crossover point for a matrix multiply microbenchmark seems to be about 11x11 in julia 0.5 with default optimizations." in the Speed subsection. I still think some statement to that effect would be nice, but on my computer (Ryzen 1800X, 8C16T) the crossover point is now at 23x23, so I'd like to see if other people get similar results:
using BenchmarkTools
using StaticArrays

for N in 1:100
    A = rand(N, N)
    SA = SMatrix{N,N}(A)
    speedup = (@belapsed $A*$A) / (@belapsed $SA*$SA)
    println("$(N)x$(N): $speedup")
    speedup < 1 && break
end
  • I'm not certain that this line in Quick start is still true:
rand(MMatrix{20,20}) * rand(MMatrix{20,20}) # large matrices can use BLAS

@andyferris (Member)

These are listed below with a proposal for what to do with each in parentheses.

I like your suggestions here!

@andyferris (Member) commented Sep 6, 2018

Re. the "speed crossover" thing, it's important that people are a bit scared of large static matrices... compile times can be awful, and the amount of generated code means that practical workloads will suffer compared to microbenchmarks, since the instruction cache on CPUs is tiny.

So those new benchmarks are Julia 1.0 on a Ryzen 1800X? I'd be interested to see what results we get from a modern Intel chip (out of curiosity; I don't think the README necessarily needs to be "Intel Inside").
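To make the compile-time cost of large static matrices concrete, here is a minimal sketch (my own illustration, not code from this PR) that separates the first call, which includes JIT compilation of the size-specialized kernel, from subsequent calls:

```julia
using StaticArrays

# A fairly large static matrix: the 20x20 multiply kernel is fully
# specialized on the size, so it must be compiled on first use.
A = rand(SMatrix{20,20,Float64})

t_first  = @elapsed A * A   # includes JIT compilation of the kernel
t_second = @elapsed A * A   # the compiled kernel alone

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The first call is typically orders of magnitude slower than the second, which is the compile-time cost being warned about; the gap grows quickly with matrix size.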

@KristofferC (Contributor)

FWIW, I don't think two documenter badges are needed. If you want the dev docs you can always use the dropdown menu.

@fredrikekre (Member)

FWIW, I don't think two documenter badges are needed. If you want the dev docs you can always use the dropdown menu.

Especially since "latest" and "stable" are pretty ambiguous. Why would you not choose the latest docs? "Latest" is not "scary" enough, and it's not clear enough that it means the development version.

Here I tried something new: https://github.com/fredrikekre/Literate.jl/blob/master/README.md and put stable release instead.

@schmrlng (Contributor, Author) commented Sep 8, 2018

Looks good, on the next update I'll essentially just ape what you've got going on at the Literate.jl README.

Unfortunately, the only modern processor I have access to is my work machine (the Ryzen 1800X); I have a 2011 MacBook Pro with an Intel quad-core, but I wouldn't exactly call that modern (it's definitely missing some newer SIMD instructions). If anyone reading this has a better option at hand, could you run julia README_benchmarks.jl from this PR, or equivalently paste this code into a julia (1.0) REPL, and post the output?

using BenchmarkTools
using LinearAlgebra
using StaticArrays

add!(C, A, B) = (C .= A .+ B)

function simple_bench(N, T=Float64)
    A = rand(T,N,N)
    A = A'*A
    B = copy(A)
    SA = SMatrix{N,N}(A)
    MA = MMatrix{N,N}(A)
    MB = copy(MA)

    print("""
============================================
    Benchmarks for $N×$N $T matrices
============================================
""")
    ops = [
        ("Matrix multiplication              ", *, (A, A), (SA, SA)),
        ("Matrix multiplication (mutating)   ", mul!, (B, A, A), (MB, MA, MA)),
        ("Matrix addition                    ", +, (A, A), (SA, SA)),
        ("Matrix addition (mutating)         ", add!, (B, A, A), (MB, MA, MA)),
        ("Matrix determinant                 ", det, A, SA),
        ("Matrix inverse                     ", inv, A, SA),
        ("Matrix symmetric eigendecomposition", eigen, A, SA),
        ("Matrix Cholesky decomposition      ", cholesky, A, SA)
    ]
    for (name, op, Aargs, SAargs) in ops
        if Aargs isa Tuple && length(Aargs) == 2
            speedup = @belapsed($op($Aargs[1], $Aargs[2])) / @belapsed($op($SAargs[1], $SAargs[2]))
        elseif Aargs isa Tuple && length(Aargs) == 3
            speedup = @belapsed($op($Aargs[1], $Aargs[2], $Aargs[3])) / @belapsed($op($SAargs[1], $SAargs[2], $SAargs[3]))
        else
            speedup = @belapsed($op($Aargs)) / @belapsed($op($SAargs))
        end
        println(name*" -> $(round(speedup, digits=1))x speedup")
    end
end

simple_bench(3)

@andyferris (Member)

If I run your code on an i7-8650U Skylake I get

============================================
    Benchmarks for 3×3 Float64 matrices
============================================
Matrix multiplication               -> 5.8x speedup
Matrix multiplication (mutating)    -> 1.6x speedup
Matrix addition                     -> 24.4x speedup
Matrix addition (mutating)          -> 2.5x speedup
Matrix determinant                  -> 81.1x speedup
Matrix inverse                      -> 57.8x speedup
Matrix symmetric eigendecomposition -> 20.4x speedup
Matrix Cholesky decomposition       -> 8.1x speedup

However, I just noticed you are using BenchmarkTools.jl here. I do love BenchmarkTools, but in the past I was getting problems with it somehow doing fewer allocations for Array than could reasonably be expected, so I had some different benchmark code... let me re-run that too.
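As a cross-check independent of BenchmarkTools' sampling and allocation machinery, a hand-rolled timing loop along these lines can be used (a sketch under my own assumptions, not the code Andy refers to):

```julia
using StaticArrays

# Average the cost of f(args...) over many evaluations, after one
# warm-up call so compilation time is excluded.
function manual_bench(f, args...; evals = 1_000_000)
    s = sum(f(args...))            # warm up; also forces use of the result
    t0 = time_ns()
    for _ in 1:evals
        s += sum(f(args...))       # consume each result so the call can't be elided
    end
    s == Inf && print("")          # keep the accumulator live
    (time_ns() - t0) / evals       # mean nanoseconds per call
end

A  = rand(3, 3)
SA = SMatrix{3,3}(A)

t_array  = manual_bench(*, A, A)
t_static = manual_bench(*, SA, SA)
println("speedup ≈ ", t_array / t_static)
```

Note the per-call sum adds a small fixed overhead, which matters at tiny sizes, so this is only a sanity check against the BenchmarkTools numbers, not a replacement for them.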

@c42f added the "documentation" (documentation improvements) label Aug 1, 2019
@c42f mentioned this pull request Aug 1, 2019
@c42f added this to the 1.0 milestone Aug 1, 2019
schmrlng and others added 2 commits September 20, 2019 16:52
Julia 1.2 will optimize some of these benchmarks away, so use
dereferencing of Refs to at least prevent that particular error.

Update the README a bit with these results.
@c42f (Member) commented Sep 20, 2019

This has been quietly bit-rotting for too long, so I took the liberty of rebasing and addressing Andy's comments — at least I hope — by using Refs to prevent unfair optimizations. (Julia 1.2 certainly optimizes out some of the test cases in the previous version of the benchmark script.)

As with all microbenchmarks, the results are only somewhat indicative. So instead of really working hard on those right now, I've just added a caveat in the documentation. The key point is that people should try StaticArrays if they care about performance for small matrices, not that we claim to know exactly how much faster it will be in a given application (which is impossible).
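For reference, the Ref trick mentioned above looks like the following with BenchmarkTools (a minimal sketch; the actual script in this PR is README_benchmarks.jl). Interpolating a `Ref` and dereferencing it inside the benchmarked expression hides the operands from the compiler, so the computation cannot be constant-folded away:

```julia
using BenchmarkTools, StaticArrays

A  = rand(3, 3)
SA = SMatrix{3,3}(A)

# Without the Ref, newer Julia versions can treat an interpolated $SA as a
# compile-time constant and optimize the multiply out entirely, which
# produces absurd-looking speedups.
t_array  = @belapsed $(Ref(A))[]  * $(Ref(A))[]
t_static = @belapsed $(Ref(SA))[] * $(Ref(SA))[]
println("speedup ≈ ", t_array / t_static)
```

The extra pointer dereference per evaluation is cheap relative to even a 3×3 multiply, so it barely distorts the comparison.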

@KristofferC (Contributor)

Tangentially related: there are a large number of three-year-old .txt files in https://github.com/JuliaArrays/StaticArrays.jl/tree/master/perf. Should these just be removed?

@c42f (Member) commented Sep 20, 2019

Quite possibly; @andyferris would know if anyone does.

Benchmark cleanup is probably a separate PR though. Ideally we'd do it systematically and regularly on a central machine but I'm skeptical that CI workers would work for that. Is there a way to integrate with the Base benchmarking infrastructure for widely used performance critical packages like StaticArrays?

@coveralls

Coverage Status

Coverage remained the same at 81.583% when pulling a7e9cba on schmrlng:README_0.7 into 935dc85 on JuliaArrays:master.

@c42f merged commit 4c8e95f into JuliaArrays:master Sep 27, 2019
@schmrlng (Contributor, Author)

Thanks for getting this across the finish line!

@c42f (Member) commented Sep 27, 2019

No problem, sorry it took such a long time!

Labels: documentation (documentation improvements)
6 participants