Skip to content

Base64 functions #3350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Nov 23, 2018
Merged

Base64 functions #3350

merged 9 commits into from
Nov 23, 2018

Conversation

alex-krash
Copy link
Contributor

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Copy link
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also attach some performance reports (along with perf top)?

@alexey-milovidov
Copy link
Member

@alex-krash
Copy link
Contributor Author

@alexey-milovidov , regarding failure on Yandex CI, I see error in each build:

CMake Error at contrib/base64-cmake/CMakeLists.txt:16 (add_library):
  Cannot find source file:

    /build/contrib/base64/lib/lib.c

  Tried extensions .c .C .c++ .cc .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp
  .hxx .in .txx

Was the submodule cloned? I have no another idea on failure, comparing to Travis

@alesapin
Copy link
Member

@alex-krash The failure is bug in CI system. Submodule was not cloned.

@alesapin
Copy link
Member

CI fixed.

@alexey-milovidov
Copy link
Member

@alex-krash Only minor changes required.

@alex-krash
Copy link
Contributor Author

@alexey-milovidov, yeah, I am finishing.
There is just one strange bug. I am measuring the performance of different implementations, and when using AVX2 and string length multiple 32, it causes crash. If I fail to debug, will ask for your assistance.

@alex-krash
Copy link
Contributor Author

Seems to be finished.
Here are benchmark results (CMake Release profile):
Reference query:

SELECT count()
FROM 
(
    SELECT base64Encode(materialize('This is a long string to test ClickHouse base64 functions impl..'))
    FROM numbers(20000000) 
)

Without base64Encode() call (just materialize()):

1 rows in set. Elapsed: 0.714 sec. Processed 20.05 million rows, 160.43 MB (28.10 million rows/s., 224.82 MB/s.)

Codecs:

AVX:   1 rows in set. Elapsed: 1.515 sec. Processed 20.05 million rows, 160.43 MB (13.24 million rows/s., 105.90 MB/s.) 
SSE42: 1 rows in set. Elapsed: 1.531 sec. Processed 20.05 million rows, 160.43 MB (13.10 million rows/s., 104.78 MB/s.) 
SSE41: 1 rows in set. Elapsed: 1.538 sec. Processed 20.05 million rows, 160.43 MB (13.04 million rows/s., 104.30 MB/s.)
SSSE3: 1 rows in set. Elapsed: 1.545 sec. Processed 20.05 million rows, 160.43 MB (12.98 million rows/s., 103.85 MB/s.)
PLAIN: 1 rows in set. Elapsed: 2.248 sec. Processed 20.05 million rows, 160.43 MB (8.92 million rows/s., 71.37 MB/s.)

All vectorized instructions shows nearly same performance, actually.
A code snippet to enable quick comparison (algorithm is switched via environment variable BASE64_IMPL):

diff --git a/dbms/src/Functions/FunctionBase64Conversion.h b/dbms/src/Functions/FunctionBase64Conversion.h
index 418de6e..3f81662 100644
--- a/dbms/src/Functions/FunctionBase64Conversion.h
+++ b/dbms/src/Functions/FunctionBase64Conversion.h
@@ -159,7 +159,30 @@ public:
 private:
     static int getCodec()
     {
-        return 0;
+        auto env_var = std::getenv("BASE64_IMPL");
+
+        if(!env_var) {
+            return 0;
+        }
+
+        std::string value = std::string(env_var);
+
+        if("AVX2" == value) {
+            return BASE64_FORCE_AVX2;
+        } else if("AVX" == value) {
+            return BASE64_FORCE_AVX;
+        } else if("SSE42" == value) {
+            return BASE64_FORCE_SSE42;
+        } else if("SSE41" == value) {
+            return BASE64_FORCE_SSE41;
+        } else if("SSSE3" == value) {
+            return BASE64_FORCE_SSSE3;
+        } else if("PLAIN" == value) {
+            return BASE64_FORCE_PLAIN;
+        }
+        else {
+            return 0;
+        }
     }
 };
 }

@alexey-milovidov alexey-milovidov merged commit 18400ad into ClickHouse:master Nov 23, 2018
alexey-milovidov added a commit that referenced this pull request Nov 23, 2018
@alexey-milovidov
Copy link
Member

Ok.

The codec is correctly selected on my test server (AVX on E5-2650 v2).

SELECT count() FROM hits_1000m_transformed WHERE base64Decode(base64Encode(URL)) != URL
  • 5.72 GB/s.

TODO: check what happens if I compile on a machine without certain instruction set and run on machine with that instruction set.

But when I manually write

    static int getCodec()
    {
        return BASE64_FORCE_SSE42;
    }

Then is will dispatch on every iteration (codec_choose) and performance drops to as low as 1.03 GB/s. This is strange and requires further investigation.

@alexey-milovidov
Copy link
Member

All vectorized instructions shows nearly same performance, actually.

On your query I also have the same results:
Queries executed: 71.

170.397 MiB/s for SSE42
181.017 MiB/s for AVX

@alexey-milovidov
Copy link
Member

Then is will dispatch on every iteration (codec_choose) and performance drops to as low as 1.03 GB/s.

It happens with multithreaded query execution:

  • single thread: 490 MiB/s.
  • 16 threads: 988 MiB/s.

Maybe this is due to poor implementation of codec_choose function.

@alexey-milovidov
Copy link
Member

This is due to write to a global variable ("false sharing") in "codec_choose_forced" method.

alexey-milovidov added a commit that referenced this pull request Nov 23, 2018
@alexey-milovidov
Copy link
Member

(Solution: simply do not use "flag" argument of functions.)

alexey-milovidov added a commit that referenced this pull request Nov 23, 2018
alexey-milovidov added a commit that referenced this pull request Dec 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants