Skip to content

Commit 2c452e0

Browse files
authored
Remove IsGlyphFullWidth from InputBuffer (#17680)
In several places the old conhost codebase appears to assume that any wide glyph is represented by two codepoints. This is probably an artifact of the ASCII/DBCS split that conhost used to have. When conhost got merged into a single UCS2-aware application, this artifact was apparently never properly resolved. To my knowledge there are at least two places where this assumption exists: The clipboard code which translates non-wide non-ascii characters to Alt-numpad sequences, and this code. Both are wrong. This is because in a Unicode-context there's no correlation between the number of codepoints and the width of the glyph, even with UCS2. In a post-UCS2-world the correct check is for surrogate pairs, as they must be avoided for the same reason DBCS were avoided. One could consider this a breaking change of the API, as this can now result in repeat counts >1 for wide glyphs. If someone complained about this change in behavior, I'd probably not change it back, as narrow complex Unicode characters exist too.
1 parent d4c1dad commit 2c452e0

File tree

2 files changed

+11
-19
lines changed

2 files changed

+11
-19
lines changed

src/host/inputBuffer.cpp

+3-8
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,6 @@
1010
#include "misc.h"
1111
#include "stream.h"
1212
#include "../interactivity/inc/ServiceLocator.hpp"
13-
#include "../types/inc/GlyphWidth.hpp"
1413

1514
#define INPUT_BUFFER_DEFAULT_INPUT_MODE (ENABLE_LINE_INPUT | ENABLE_PROCESSED_INPUT | ENABLE_ECHO_INPUT | ENABLE_MOUSE_INPUT)
1615

@@ -804,13 +803,9 @@ bool InputBuffer::_CoalesceEvent(const INPUT_RECORD& inEvent) noexcept
804803
(lastKey.wVirtualScanCode == inKey.wVirtualScanCode || WI_IsFlagSet(inKey.dwControlKeyState, NLS_IME_CONVERSION)) &&
805804
lastKey.uChar.UnicodeChar == inKey.uChar.UnicodeChar &&
806805
lastKey.dwControlKeyState == inKey.dwControlKeyState &&
807-
// TODO:GH#8000 This behavior is an import from old conhost v1 and has been broken for decades.
808-
// This is probably the outdated idea that any wide glyph is being represented by 2 characters (DBCS) and likely
809-
// resulted from conhost originally being split into a ASCII/OEM and a DBCS variant with preprocessor flags.
810-
// You can't update the repeat count of such a A,B pair, because they're stored as A,A,B,B (down-down, up-up).
811-
// I believe the proper approach is to store pairs of characters as pairs, update their combined
812-
// repeat count and only when they're being read de-coalesce them into their alternating form.
813-
!IsGlyphFullWidth(inKey.uChar.UnicodeChar))
806+
// A single repeat count cannot represent two INPUT_RECORDs simultaneously,
807+
// and so it cannot represent a surrogate pair either.
808+
!til::is_surrogate(inKey.uChar.UnicodeChar))
814809
{
815810
lastKey.wRepeatCount += inKey.wRepeatCount;
816811
return true;

src/host/ut_host/InputBufferTests.cpp

+8-11
Original file line numberDiff line numberDiff line change
@@ -219,22 +219,19 @@ class InputBufferTests
219219
}
220220
}
221221

222-
TEST_METHOD(InputBufferDoesNotCoalesceFullWidthChars)
222+
TEST_METHOD(InputBufferDoesNotCoalesceSurrogatePairs)
223223
{
224224
InputBuffer inputBuffer;
225-
WCHAR hiraganaA = 0x3042; // U+3042 hiragana A
226-
auto record = MakeKeyEvent(true, 1, hiraganaA, 0, hiraganaA, 0);
227225

228-
// send a bunch of identical events
229-
inputBuffer.Flush();
230-
for (size_t i = 0; i < RECORD_INSERT_COUNT; ++i)
231-
{
232-
VERIFY_IS_GREATER_THAN(inputBuffer.Write(record), 0u);
233-
VERIFY_ARE_EQUAL(inputBuffer._storage.back(), record);
234-
}
226+
// U+1F44D thumbs up emoji
227+
inputBuffer.Write(MakeKeyEvent(true, 1, 0xD83D, 0, 0xD83D, 0));
228+
inputBuffer.Write(MakeKeyEvent(true, 1, 0xDC4D, 0, 0xDC4D, 0));
229+
230+
// Should not coalesce despite otherwise matching perfectly.
231+
inputBuffer.Write(MakeKeyEvent(true, 1, 0xDC4D, 0, 0xDC4D, 0));
235232

236233
// The events shouldn't be coalesced
237-
VERIFY_ARE_EQUAL(inputBuffer.GetNumberOfReadyEvents(), RECORD_INSERT_COUNT);
234+
VERIFY_ARE_EQUAL(3u, inputBuffer.GetNumberOfReadyEvents());
238235
}
239236

240237
TEST_METHOD(CanFlushAllOutput)

0 commit comments

Comments
 (0)