
[WIP] README updates for 0.7 #467

Merged 2 commits on Sep 27, 2019
65 changes: 24 additions & 41 deletions README.md
@@ -5,12 +5,12 @@
[![Build Status](https://travis-ci.org/JuliaArrays/StaticArrays.jl.svg?branch=master)](https://travis-ci.org/JuliaArrays/StaticArrays.jl)
[![Build status](https://ci.appveyor.com/api/projects/status/xabgh1yhsjxlp30d?svg=true)](https://ci.appveyor.com/project/JuliaArrays/staticarrays-jl)
[![Coverage Status](https://coveralls.io/repos/github/JuliaArrays/StaticArrays.jl/badge.svg?branch=master)](https://coveralls.io/github/JuliaArrays/StaticArrays.jl?branch=master)
[![codecov.io](http://codecov.io/github/JuliaArrays/StaticArrays.jl/coverage.svg?branch=master)](http://codecov.io/github/JuliaArrays/StaticArrays.jl?branch=master)
[![codecov.io](https://codecov.io/github/JuliaArrays/StaticArrays.jl/branch/master/graph/badge.svg)](http://codecov.io/github/JuliaArrays/StaticArrays.jl/branch/master)
[![](https://img.shields.io/badge/docs-latest-blue.svg)](https://JuliaArrays.github.io/StaticArrays.jl/latest)
[![](https://img.shields.io/badge/docs-stable-blue.svg)](https://JuliaArrays.github.io/StaticArrays.jl/stable)

**StaticArrays** provides a framework for implementing statically sized arrays
in Julia (≥ 0.5), using the abstract type `StaticArray{Size,T,N} <: AbstractArray{T,N}`.
in Julia, using the abstract type `StaticArray{Size,T,N} <: AbstractArray{T,N}`.
Subtypes of `StaticArray` will provide fast implementations of common array and
linear algebra operations. Note that here "statically sized" means that the
size can be determined from the *type*, and "static" does **not** necessarily
@@ -27,40 +27,42 @@ Full documentation can be found [here](https://JuliaArrays.github.io/StaticArray
## Speed

The speed of *small* `SVector`s, `SMatrix`s and `SArray`s is often > 10 × faster
than `Base.Array`. See this simplified benchmark (or see the full results [here](https://github.com/andyferris/StaticArrays.jl/blob/master/perf/bench10.txt)):
than `Base.Array`. For example, here's a
[microbenchmark](perf/README_benchmarks.jl) showing some common operations.

```
============================================
Benchmarks for 3×3 Float64 matrices
============================================

Matrix multiplication -> 8.2x speedup
Matrix multiplication (mutating) -> 3.1x speedup
Matrix addition -> 45x speedup
Matrix addition (mutating) -> 5.1x speedup
Matrix determinant -> 170x speedup
Matrix inverse -> 125x speedup
Matrix symmetric eigendecomposition -> 82x speedup
Matrix Cholesky decomposition -> 23.6x speedup
Matrix multiplication -> 5.9x speedup
Matrix multiplication (mutating) -> 1.8x speedup
Matrix addition -> 33.1x speedup
Matrix addition (mutating) -> 2.5x speedup
Matrix determinant -> 112.9x speedup
Matrix inverse -> 67.8x speedup
Matrix symmetric eigendecomposition -> 25.0x speedup
Matrix Cholesky decomposition -> 8.8x speedup
Matrix LU decomposition -> 6.1x speedup
Matrix QR decomposition -> 65.0x speedup
```

These results improve significantly when using `julia -O3` with immutable static
arrays, as the extra optimization results in surprisingly good SIMD code.
These numbers were generated on an Intel i7-7700HQ using Julia-1.2. As with all
synthetic benchmarks, the speedups you see here should only be taken as very
roughly indicative of the speedup you may see in real code. When in doubt,
benchmark your real application!

Note that in the current implementation, working with large `StaticArray`s puts a
lot of stress on the compiler, and becomes slower than `Base.Array` as the size
increases. A very rough rule of thumb is that you should consider using a
normal `Array` for arrays larger than 100 elements. For example, the performance
crossover point for a matrix multiply microbenchmark seems to be about 11x11 in
julia 0.5 with default optimizations.
normal `Array` for arrays larger than 100 elements.
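
If in doubt about a particular size, it is easy to measure the crossover on your own machine. Here is a minimal sketch (not part of the package, and assuming the BenchmarkTools package is installed) that times an `N`×`N` multiply with both `Matrix` and `SMatrix`; ratios well above 1 mean the static version is still ahead at that size:

```julia
# Rough check of the Matrix vs SMatrix crossover for N×N multiplication.
# Assumes BenchmarkTools is installed; exact ratios vary by CPU and Julia version.
using BenchmarkTools
using StaticArrays

function multiply_speedup(N)
    A  = rand(N, N)
    SA = SMatrix{N,N}(A)
    t_array  = @belapsed $A * $A     # heap-allocated Base.Matrix
    t_static = @belapsed $SA * $SA   # statically sized SMatrix
    return t_array / t_static
end

println("3×3 multiply speedup:   ", round(multiply_speedup(3), digits=1), "x")
println("12×12 multiply speedup: ", round(multiply_speedup(12), digits=1), "x")
```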


## Quick start

Add *StaticArrays* from the [Pkg REPL](https://docs.julialang.org/en/latest/stdlib/Pkg/#Getting-Started-1), i.e., `pkg> add StaticArrays`. Then:
```julia
Pkg.add("StaticArrays") # or Pkg.clone("https://github.com/JuliaArrays/StaticArrays.jl")
using StaticArrays
using LinearAlgebra
using StaticArrays

# Create an SVector using various forms, using constructors, functions or macros
v1 = SVector(1, 2, 3)
@@ -104,7 +106,8 @@ rand(MMatrix{20,20}) * rand(MMatrix{20,20}) # large matrices can use BLAS
eigen(m3) # eigen(), etc uses specialized algorithms up to 3×3, or else LAPACK

# Static arrays stay statically sized, even when used by Base functions, etc:
typeof(eigen(m3)) == Eigen{Float64,Float64,SArray{Tuple{3,3},Float64,2,9},SArray{Tuple{3},Float64,1,3}}
typeof(eigen(m3).vectors) == SMatrix{3,3,Float64,9}
typeof(eigen(m3).values) == SVector{3,Float64}

# similar() returns a mutable container, while similar_type() returns a constructor:
typeof(similar(m3)) == MArray{Tuple{3,3},Int64,2,9} # (final parameter is length = 9)
@@ -138,27 +141,7 @@ performance optimizations may be made when the size of the array is known to the
compiler. One example of this is by loop unrolling, which has a substantial
effect on small arrays and tends to automatically trigger LLVM's SIMD
optimizations. Another way performance is boosted is by providing specialized
methods for `det`, `inv`, `eig` and `chol` where the algorithm depends on the
methods for `det`, `inv`, `eigen` and `cholesky` where the algorithm depends on the
precise dimensions of the input. In combination with intelligent fallbacks to
the methods in Base, we seek to provide a comprehensive support for statically
sized arrays, large or small, that hopefully "just works".
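
As a small illustration of that dispatch (a sketch, not an excerpt from the documentation), `det` and `inv` on a 3×3 `SMatrix` hit the fixed-size methods and keep their results statically sized:

```julia
# Sketch: specialized small-matrix methods keep everything static.
using LinearAlgebra
using StaticArrays

m = @SMatrix rand(3, 3)

d  = det(m)    # closed-form 3×3 determinant, no generic fallback needed
mi = inv(m)    # specialized small-matrix inverse, returns another SMatrix

@assert mi isa SMatrix{3,3,Float64}
@assert d * det(mi) ≈ 1
```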

## Relationship to *FixedSizeArrays* and *ImmutableArrays*

Several existing packages for statically sized arrays have been developed for
Julia, notably *FixedSizeArrays* and *ImmutableArrays*, which provided significant
inspiration for this package. Upon consultation, it has been decided to move
forward with *StaticArrays* which has found a new home in the *JuliaArrays*
github organization. It is recommended that new users use this package, and
that existing dependent packages consider switching to *StaticArrays* sometime
during the life-cycle of Julia v0.5.

You can try `using StaticArrays.FixedSizeArrays` to add some compatibility
wrappers for the most commonly used features of the *FixedSizeArrays* package,
such as `Vec`, `Mat`, `Point` and `@fsa`. These wrappers do not provide a
perfect interface, but may help in trying out *StaticArrays* with pre-existing
code.

Furthermore, `using StaticArrays.ImmutableArrays` will let you use the typenames
from the *ImmutableArrays* package, which does not include the array size as a
type parameter (e.g. `Vector3{T}` and `Matrix3x3{T}`).
62 changes: 62 additions & 0 deletions perf/README_benchmarks.jl
@@ -0,0 +1,62 @@
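# Microbenchmark comparing common operations on Base.Array against their
# StaticArrays counterparts. simple_bench(3) at the bottom prints the speedup
# table quoted in the README "Speed" section.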
using BenchmarkTools
using LinearAlgebra
using StaticArrays

add!(C, A, B) = (C .= A .+ B)

function simple_bench(N, T=Float64)
    A = rand(T,N,N)
    A = A'*A
    B = copy(A)
    SA = SMatrix{N,N}(A)
    MA = MMatrix{N,N}(A)
    MB = copy(MA)

    print("""
          ============================================
          Benchmarks for $N×$N $T matrices
          ============================================
          """)
    ops = [
        ("Matrix multiplication ", *, (A, A), (SA, SA)),
        ("Matrix multiplication (mutating) ", mul!, (B, A, A), (MB, MA, MA)),
        ("Matrix addition ", +, (A, A), (SA, SA)),
        ("Matrix addition (mutating) ", add!, (B, A, A), (MB, MA, MA)),
        ("Matrix determinant ", det, (A,), (SA,)),
        ("Matrix inverse ", inv, (A,), (SA,)),
        ("Matrix symmetric eigendecomposition", eigen, (A,), (SA,)),
        ("Matrix Cholesky decomposition ", cholesky, (A,), (SA,)),
        ("Matrix LU decomposition ", lu, (A,), (SA,)),
        ("Matrix QR decomposition ", qr, (A,), (SA,)),
    ]
    for (name, op, Aargs, SAargs) in ops
        # We load from Ref's here to avoid the compiler completely removing the
        # benchmark in some cases.
        #
        # Like any microbenchmark, the speedups you see here should only be
        # taken as roughly indicative of the speedup you may see in real code.
        if length(Aargs) == 1
            A1 = Ref(Aargs[1])
            SA1 = Ref(SAargs[1])
            speedup = @belapsed($op($A1[])) / @belapsed($op($SA1[]))
        elseif length(Aargs) == 2
            A1 = Ref(Aargs[1])
            A2 = Ref(Aargs[2])
            SA1 = Ref(SAargs[1])
            SA2 = Ref(SAargs[2])
            speedup = @belapsed($op($A1[], $A2[])) / @belapsed($op($SA1[], $SA2[]))
        elseif length(Aargs) == 3
            A1 = Ref(Aargs[1])
            A2 = Ref(Aargs[2])
            A3 = Ref(Aargs[3])
            SA1 = Ref(SAargs[1])
            SA2 = Ref(SAargs[2])
            SA3 = Ref(SAargs[3])
            speedup = @belapsed($op($A1[], $A2[], $A3[])) / @belapsed($op($SA1[], $SA2[], $SA3[]))
        else
            error("benchmark entries must take 1, 2 or 3 arguments")
        end
        println(name*" -> $(round(speedup, digits=1))x speedup")
    end
end

simple_bench(3)
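
For reference, assuming BenchmarkTools, StaticArrays and LinearAlgebra are installed in the active environment, the table can be regenerated from a Julia REPL started at the repository root with something like:

```julia
# Re-run the README benchmark script; simple_bench(3) executes on include.
# This takes a while, since @belapsed samples each operation many times.
include(joinpath("perf", "README_benchmarks.jl"))
```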