-
Notifications
You must be signed in to change notification settings - Fork 571
use Eigen as a BLAS alternative #858
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
How does it compare performance-wise? |
I did a quick test with a T35 net and it is about 8-10% quicker than |
Builds and runs on Android. Great news! Now we've got two back-ends that are able to build through cross-compilation. On my first tests it appears to be just a little bit slower than OpenBlas:
|
Android builds :D lc0-eigen-armv7a.zip |
@@ -44,7 +56,7 @@ void Convolution1::Forward(const size_t batch_size, const size_t input_channels, | |||
|
|||
const float* batch_input = input + i * kSquares * input_channels; | |||
float* batch_output = output + i * kSquares * output_channels; | |||
|
|||
#ifndef USE_EIGEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could BLAS and Eigen in theory co-exist?
Then if would be better to have templated functions (e.g. <bool is_eigen>
) which would be shown up as different backends.
See for example:
cudnn vs cudnn-fp16
https://github.com/LeelaChessZero/lc0/blob/master/src/neural/cuda/network_cudnn.cc#L805
#5 (comment)
tensorflow vs tensorflow-cpu
https://github.com/LeelaChessZero/lc0/blob/master/src/neural/network_tf.cc#L342
May be not that straighforward though as different header files are needed (so #ifdefs will be needed in any case), and calling non-included functions in if(false) {}
won't compile.
Calling them from non-instanciated function template specializations will work though:
template<>
DoStuff<false /* is eigen*/>() {
known_functions();
}
template<>
DoStuff<true /* is eigen*/>() {
unknown_functions(); // Is fine
}
#if defined(HAVE_EIGEN)
REGISTER_NETWORK("eigen", MakeBlasNetwork<true>, 90)
#endif
#if defined(HAVE_BLAS)
REGISTER_NETWORK("blas", MakeBlasNetwork<false>, 90)
#endif
Not sure if easy enough to be bothered with, but that way it would be nicer.
Also there is this brute-force way to allow co-existance:
template<bool is_eigen> MyFunction() {
#ifdef HAVE_EIGEN
if (is_eigen) {
// Do eigen stuff.
}
#endif
#ifdef HAVE_BLAS
if (!is_eigen) {
// Do blas stuff.
}
#endif
}
#if defined(HAVE_EIGEN)
REGISTER_NETWORK("eigen", MakeBlasNetwork<true>, 90)
#endif
#if defined(HAVE_BLAS)
REGISTER_NETWORK("blas", MakeBlasNetwork<false>, 90)
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a reason for this: eigen is only suitable for the build machine, so blas is the way to go for redistributable cpu only binaries. The only use case I can see is for benchmarking between eigen and mkl on a specific machine - is there any other use case?
This is a port of leela-zero/leela-zero#1692.