LWG-4186 regex_traits::transform_primary mistakenly detects typeid of a function #5291

Closed
StephanTLavavej opened this issue Feb 16, 2025 · 3 comments · Fixed by #5444
Labels
fixed (Something works now, yay!); LWG (Library Working Group issue); regex (meow is a substring of homeowner)

Comments

@StephanTLavavej
Member

LWG-4186 regex_traits::transform_primary mistakenly detects typeid of a function

@StephanTLavavej added the LWG (Library Working Group issue) and regex (meow is a substring of homeowner) labels on Feb 16, 2025
@github-project-automation (bot) moved this to Available in STL LWG Issues on Feb 16, 2025
@muellerj2
Contributor

More generally, regex_traits::transform_primary() currently implements the general traits requirement in [re.req]/20 and not the specified implementation in [re.traits]/7. Besides the missing comparison with typeid, this also means that diacritics are not properly handled in (most?) non-C locales.

If we want to implement the specification, we should probably add variants of _Strxfrm() and _Wcsxfrm() to the import library that call __crtLCMapStringA/W with appropriate flags for non-C locales (maybe LCMAP_SORTKEY | LINGUISTIC_IGNORECASE | LINGUISTIC_IGNOREDIACRITIC | NORM_IGNOREKANA | NORM_IGNOREWIDTH or something similar).

If we add these new variants of _Strxfrm() and _Wcsxfrm(), we should make sure that their return values are consistent, unlike the current implementations of _Strxfrm() and _Wcsxfrm() (#5210).
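
For readers following along, the control flow that [re.traits]/7 appears to intend (with typeid applied to the facet object rather than to the use_facet function, which is the subject of LWG-4186) can be sketched portably as below. This is only an illustration under stated assumptions, not the STL's implementation: transform_primary_sketch is a hypothetical name, and the full transform() sort key stands in for a real primary sort key, which would need a platform-specific routine such as the LCMapString-based variants suggested above.

```cpp
#include <locale>
#include <string>
#include <typeinfo>

// Hypothetical helper for illustration; not part of any standard library.
template <class CharT>
std::basic_string<CharT> transform_primary_sketch(
    const std::locale& loc, const std::basic_string<CharT>& text) {
    const std::collate<CharT>& fac = std::use_facet<std::collate<CharT>>(loc);

    // std::collate is polymorphic, so typeid on the reference reports the
    // dynamic type of the facet actually installed in the locale.
    if (typeid(fac) == typeid(std::collate_byname<CharT>)) {
        // Assumption: a real implementation would convert the sort key into a
        // primary sort key here (ignoring case, diacritics, and so on).
        // The full sort key is returned only as a placeholder.
        return fac.transform(text.data(), text.data() + text.size());
    }

    // Any other facet type: the sort key format is unknown, so return empty.
    return std::basic_string<CharT>();
}
```

With this shape, a locale whose collate facet is a plain std::collate<charT> always yields an empty string, which is why the exact facet type that std::locale installs matters for the fix.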

@muellerj2
Contributor

std::locale doesn't actually construct any facets of type std::collate_byname<charT>. It constructs facets of type std::collate<charT> instead. See: https://godbolt.org/z/Te4rcM173

So comparing only with typeid(std::collate_byname<charT>) doesn't make sense. We either have to correct the facets constructed by std::locale (would this break ABI?) or compare with typeid(std::collate<charT>) as well.
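
A minimal sketch of the second option, assuming the check should accept exactly the two standard facet types and reject anything derived by the user. The names is_standard_collate and custom_collate are hypothetical and exist only for this illustration.

```cpp
#include <locale>
#include <typeinfo>

// Hypothetical check: true if the locale's collate facet is exactly
// std::collate<CharT> or exactly std::collate_byname<CharT>.
template <class CharT>
bool is_standard_collate(const std::locale& loc) {
    const std::collate<CharT>& fac = std::use_facet<std::collate<CharT>>(loc);
    const std::type_info& dynamic_type = typeid(fac);  // dynamic type of the facet
    return dynamic_type == typeid(std::collate<CharT>)
        || dynamic_type == typeid(std::collate_byname<CharT>);
}

// A user-defined facet, used to show that derived types are rejected.
struct custom_collate : std::collate<char> {};
```

For example, is_standard_collate<char>(std::locale::classic()) is expected to hold, while a locale built as std::locale(std::locale::classic(), new custom_collate) should be rejected because the facet's dynamic type is neither of the two standard ones.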

@StephanTLavavej
Member Author

> We either have to correct the facets constructed by std::locale (would this break ABI?)

I don't understand std::locale well enough to say whether it would be binary-compatible. All I know is that std::locale is the most fragile part of the library and changes have had a high risk of damaging it, even back when we could break ABI every major version.

Projects
Status: Done
2 participants