LUCENE-10315: Speed up BKD leaf block ids codec by a 512 ints ForUtil #541
Merged
37 commits
88f0bd5 stash (gf2121)
141dc40 stash (gf2121)
2bfe1aa check (gf2121)
8b79de0 add forutil (gf2121)
05d7ba2 name codes (gf2121)
d80bb28 note (gf2121)
e4bd039 for util (gf2121)
92bfc83 reduce code num (gf2121)
73d00ff spotless (gf2121)
4648f83 limit bpv to 16/24/32 and using floor delta codec (gf2121)
634e56e make diff a bit more beautiful (gf2121)
ffdfb26 iter (gf2121)
92e6710 assert count (gf2121)
8c70d9c make writer final (gf2121)
cafe4fc iter (gf2121)
a387949 try to make remainder also SIMD (gf2121)
7aa92ea plus when expand (gf2121)
6ff0cec judge cluster should not rely on sorted (gf2121)
1f09b3a add an assert (gf2121)
2708036 assert is making CI angry, remove (gf2121)
f44f260 use int (gf2121)
4a28b25 Merge remote-tracking branch 'origin/main' into LUCENE-10315 (gf2121)
0b452da spotless (gf2121)
bf78353 iter on review advice (gf2121)
ccdae2f make bkd foruti flexible (gf2121)
3b7ba85 resolve conflict (gf2121)
6a84978 unset int buffer (gf2121)
09f2999 add some tests for read Ints (gf2121)
51056c2 spotless (gf2121)
aa73d07 iter (gf2121)
9e084b8 fix tests and add some notes for tmp length (gf2121)
38abc4c fix typo (gf2121)
4db27a2 spotless (gf2121)
8608db0 iter on feed back (gf2121)
4893afc Update lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java (gf2121)
be14afa Update lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java (gf2121)
3376f9a Merge remote-tracking branch 'origin/main' into LUCENE-10315 (gf2121)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -169,6 +169,20 @@ public void readLongs(long[] dst, int offset, int length) throws IOException { | |
} | ||
} | ||
|
||
/** | ||
* Reads a specified number of ints into an array at the specified offset. | ||
* | ||
* @param dst the array to read ints into | ||
* @param offset the offset in the array to start storing ints | ||
* @param length the number of ints to read | ||
*/ | ||
public void readInts(int[] dst, int offset, int length) throws IOException { | ||
Review comment: Can you add javadocs? |
||
Objects.checkFromIndexSize(offset, length, dst.length); | ||
for (int i = 0; i < length; ++i) { | ||
dst[offset + i] = readInt(); | ||
} | ||
} | ||
|
||
/** | ||
* Reads a specified number of floats into an array at the specified offset. | ||
* | ||
|
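
The default `readInts` above calls `readInt()` once per value. A minimal sketch of why a bulk method is still worth having: an input backed by a `ByteBuffer` can override it with a single bulk copy. `SketchInput` and `BufferSketchInput` are hypothetical stand-ins written for this illustration, not Lucene classes, and the little-endian byte order is an assumption of the sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Objects;

/** Hypothetical stand-in for a data input; not Lucene's DataInput. */
abstract class SketchInput {
  abstract int readInt();

  /** Default bulk read: one readInt() call per value, mirroring the diff above. */
  void readInts(int[] dst, int offset, int length) {
    Objects.checkFromIndexSize(offset, length, dst.length);
    for (int i = 0; i < length; ++i) {
      dst[offset + i] = readInt();
    }
  }
}

/** Buffer-backed variant that replaces the per-value loop with one IntBuffer copy. */
final class BufferSketchInput extends SketchInput {
  private final ByteBuffer buf;

  BufferSketchInput(byte[] bytes) {
    // Byte order is an assumption of this sketch.
    this.buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
  }

  @Override
  int readInt() {
    return buf.getInt();
  }

  @Override
  void readInts(int[] dst, int offset, int length) {
    Objects.checkFromIndexSize(offset, length, dst.length);
    // The int view starts at the current byte position and inherits the byte order.
    buf.asIntBuffer().get(dst, offset, length);
    // The view has its own position, so advance the underlying buffer by hand.
    buf.position(buf.position() + length * Integer.BYTES);
  }

  /** Helper for the usage example: serializes ints in the same byte order. */
  static byte[] bytesOf(int... values) {
    ByteBuffer b = ByteBuffer.allocate(values.length * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    for (int v : values) {
      b.putInt(v);
    }
    return b.array();
  }
}
```

For example, `new BufferSketchInput(BufferSketchInput.bytesOf(1, 2, 3)).readInts(dst, 0, 3)` fills `dst` in one copy, and a later `readInt()` continues from where the bulk read stopped.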
108 changes: 108 additions & 0 deletions
lucene/core/src/java/org/apache/lucene/util/bkd/BKDForUtil.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one or more | ||
* contributor license agreements. See the NOTICE file distributed with | ||
* this work for additional information regarding copyright ownership. | ||
* The ASF licenses this file to You under the Apache License, Version 2.0 | ||
* (the "License"); you may not use this file except in compliance with | ||
* the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package org.apache.lucene.util.bkd; | ||
|
||
import java.io.IOException; | ||
import org.apache.lucene.store.DataInput; | ||
import org.apache.lucene.store.DataOutput; | ||
|
||
final class BKDForUtil { | ||
|
||
private final int[] tmp; | ||
|
||
BKDForUtil(int maxPointsInLeaf) { | ||
// For encode16/decode16, we do not need to use tmp array. | ||
// For encode24/decode24, we need a (3/4 * maxPointsInLeaf) length tmp array. | ||
// For encode32/decode32, we reuse the scratch in DocIdsWriter. | ||
// So (3/4 * maxPointsInLeaf) is enough here. | ||
final int len = (maxPointsInLeaf >>> 2) * 3; | ||
tmp = new int[len]; | ||
} | ||
|
||
void encode16(int len, int[] ints, DataOutput out) throws IOException { | ||
final int halfLen = len >>> 1; | ||
for (int i = 0; i < halfLen; ++i) { | ||
ints[i] = ints[halfLen + i] | (ints[i] << 16); | ||
} | ||
for (int i = 0; i < halfLen; i++) { | ||
out.writeInt(ints[i]); | ||
} | ||
if ((len & 1) == 1) { | ||
out.writeShort((short) ints[len - 1]); | ||
} | ||
} | ||
|
||
void encode32(int off, int len, int[] ints, DataOutput out) throws IOException { | ||
for (int i = 0; i < len; i++) { | ||
out.writeInt(ints[off + i]); | ||
} | ||
} | ||
|
||
void encode24(int off, int len, int[] ints, DataOutput out) throws IOException { | ||
final int quarterLen = len >>> 2; | ||
final int quarterLen3 = quarterLen * 3; | ||
for (int i = 0; i < quarterLen3; ++i) { | ||
tmp[i] = ints[off + i] << 8; | ||
} | ||
for (int i = 0; i < quarterLen; i++) { | ||
final int longIdx = off + i + quarterLen3; | ||
tmp[i] |= ints[longIdx] >>> 16; | ||
tmp[i + quarterLen] |= (ints[longIdx] >>> 8) & 0xFF; | ||
tmp[i + quarterLen * 2] |= ints[longIdx] & 0xFF; | ||
} | ||
for (int i = 0; i < quarterLen3; ++i) { | ||
out.writeInt(tmp[i]); | ||
} | ||
|
||
final int remainder = len & 0x3; | ||
for (int i = 0; i < remainder; i++) { | ||
out.writeInt(ints[off + quarterLen * 4 + i]); | ||
} | ||
} | ||
|
||
void decode16(DataInput in, int[] ints, int len, final int base) throws IOException { | ||
final int halfLen = len >>> 1; | ||
in.readInts(ints, 0, halfLen); | ||
for (int i = 0; i < halfLen; ++i) { | ||
int l = ints[i]; | ||
ints[i] = (l >>> 16) + base; | ||
ints[halfLen + i] = (l & 0xFFFF) + base; | ||
} | ||
if ((len & 1) == 1) { | ||
ints[len - 1] = Short.toUnsignedInt(in.readShort()) + base; | ||
} | ||
} | ||
|
||
void decode24(DataInput in, int[] ints, int len) throws IOException { | ||
final int quarterLen = len >>> 2; | ||
final int quarterLen3 = quarterLen * 3; | ||
in.readInts(tmp, 0, quarterLen3); | ||
for (int i = 0; i < quarterLen3; ++i) { | ||
ints[i] = tmp[i] >>> 8; | ||
} | ||
for (int i = 0; i < quarterLen; i++) { | ||
ints[i + quarterLen3] = | ||
((tmp[i] & 0xFF) << 16) | ||
| ((tmp[i + quarterLen] & 0xFF) << 8) | ||
| (tmp[i + quarterLen * 2] & 0xFF); | ||
} | ||
int remainder = len & 0x3; | ||
if (remainder > 0) { | ||
in.readInts(ints, quarterLen << 2, remainder); | ||
} | ||
} | ||
} |
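
The 16-bit and 24-bit layouts in `BKDForUtil` can be checked with a small round trip. The following is a sketch under stated assumptions, not the PR's implementation: `BkdPackingSketch` is a hypothetical class that packs into a plain `int[]` instead of a `DataOutput`, allocates its own output rather than reusing `tmp`, stores the odd 16-bit value as a full int where the diff writes a short, and indexes the 24-bit leftovers relative to `off`. Inputs are assumed to be non-negative and to fit in 16 or 24 bits respectively.

```java
/** Hypothetical round-trip sketch of the 16-bit and 24-bit layouts; not Lucene code. */
final class BkdPackingSketch {

  /** Two 16-bit deltas per int: value i in the high half, value halfLen + i in the low half. */
  static int[] encode16(int len, int[] ints) {
    final int halfLen = len >>> 1;
    final int[] out = new int[halfLen + (len & 1)];
    for (int i = 0; i < halfLen; ++i) {
      out[i] = ints[halfLen + i] | (ints[i] << 16);
    }
    if ((len & 1) == 1) {
      out[halfLen] = ints[len - 1] & 0xFFFF; // the diff writes this value as a short
    }
    return out;
  }

  /** Unpacks the deltas and adds the base back, as decode16 does. */
  static int[] decode16(int[] packed, int len, int base) {
    final int halfLen = len >>> 1;
    final int[] ints = new int[len];
    for (int i = 0; i < halfLen; ++i) {
      ints[i] = (packed[i] >>> 16) + base;
      ints[halfLen + i] = (packed[i] & 0xFFFF) + base;
    }
    if ((len & 1) == 1) {
      ints[len - 1] = packed[halfLen] + base;
    }
    return ints;
  }

  /** Four 24-bit values per three ints, plus up to three leftover values stored as plain ints. */
  static int[] encode24(int off, int len, int[] ints) {
    final int quarterLen = len >>> 2;
    final int quarterLen3 = quarterLen * 3;
    final int remainder = len & 0x3;
    final int[] out = new int[quarterLen3 + remainder];
    // The first three quarters occupy the upper 24 bits of one int each.
    for (int i = 0; i < quarterLen3; ++i) {
      out[i] = ints[off + i] << 8;
    }
    // Each value of the last quarter is split into three bytes that fill the free low bytes.
    for (int i = 0; i < quarterLen; i++) {
      final int v = ints[off + quarterLen3 + i];
      out[i] |= v >>> 16;
      out[i + quarterLen] |= (v >>> 8) & 0xFF;
      out[i + quarterLen * 2] |= v & 0xFF;
    }
    // Assumption of this sketch: leftovers are indexed relative to off.
    for (int i = 0; i < remainder; i++) {
      out[quarterLen3 + i] = ints[off + quarterLen * 4 + i];
    }
    return out;
  }

  /** Reverses encode24: shift the upper 24 bits down, then reassemble the last quarter. */
  static int[] decode24(int[] packed, int len) {
    final int quarterLen = len >>> 2;
    final int quarterLen3 = quarterLen * 3;
    final int[] ints = new int[len];
    for (int i = 0; i < quarterLen3; ++i) {
      ints[i] = packed[i] >>> 8;
    }
    for (int i = 0; i < quarterLen; i++) {
      ints[i + quarterLen3] =
          ((packed[i] & 0xFF) << 16)
              | ((packed[i + quarterLen] & 0xFF) << 8)
              | (packed[i + quarterLen * 2] & 0xFF);
    }
    System.arraycopy(packed, quarterLen3, ints, quarterLen << 2, len & 0x3);
    return ints;
  }
}
```

In this sketch, ten 24-bit values pack into `2 * 3 + 2 = 8` ints. The packed part uses `(len >>> 2) * 3` ints, which has the same form as the `tmp` sizing in the constructor.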
`curFloatBufferViews` does not belong to this PR. I wonder if we should open a separate issue for this, as it might lead to unknown bugs. What do you think, @jpountz?
+1 to open a separate issue