Skip to content

keysyms: Check clashes between keysyms names and keywords #598

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

wismill
Copy link
Member

@wismill wismill commented Jan 17, 2025

Due to how our parser is implemented, keysyms names that are also valid keywords require special handling.

Added a check for these clashes in the keysym generator. The only current clash, section, is already handled. Note that it means that section, Section, sEcTiOn all parse to the same keysym.

Hopefully we will never have a clash again, but the keysyms are not a frozen set.


I got puzzled by the following excerpt in parser.y and decided to investigate.

KeySym          :       IDENT
                …
                |       SECTION { $$ = XKB_KEY_section; }

@wismill wismill added compile-keymap Indicates a need for improvements or additions to keymap compilation keysyms generator labels Jan 17, 2025
@wismill wismill added this to the 1.8.0 milestone Jan 17, 2025
@wismill wismill requested review from bluetech and whot January 17, 2025 06:29
@wismill wismill force-pushed the keysyms/check-xkbcomp-keywords-clashes branch from 076fbf2 to 177cf71 Compare January 17, 2025 06:32
@wismill wismill removed this from the 1.8.0 milestone Jan 17, 2025
@wismill wismill force-pushed the keysyms/check-xkbcomp-keywords-clashes branch from 177cf71 to cc68776 Compare January 17, 2025 17:58
Copy link
Member

@bluetech bluetech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@bluetech
Copy link
Member

I got puzzled by the following excerpt in parser.y and decided to investigate.

Perhaps also add a comment on this line?

Due to how our parser is implemented, keysyms names that are also valid
keywords require special handling.

Added a check for these clashes in the keysym generator. The only
current clash, `section`, is already handled. Note that it means that
e.g. `section`, `Section` and `sEcTiOn` all parse to the same keysym.
This side effect is fine here, because *currently* there is no other
keysym that clashes with any possible of the case variation of `section`.
But in order to be extra cautious, we now test thoses clashes too.

Hopefully we will never have a clash again, but while it is unlikely
that we modify the keywords, the keysyms are not a frozen set.
@wismill wismill force-pushed the keysyms/check-xkbcomp-keywords-clashes branch from cc68776 to d5e3386 Compare January 19, 2025 11:26
@wismill
Copy link
Member Author

wismill commented Jan 19, 2025

I got puzzled by the following excerpt in parser.y and decided to investigate.

Perhaps also add a comment on this line?

@bluetech Good point, added.

As a general note, if I had to rewrite the parser, I would opt for the “parse, don’t validate” approach.

I wonder if it would make sense to switch to using Tree-sitter someday. I would love to get as a bonus correct syntax highlighting for XKB files.

@bluetech
Copy link
Member

As a general note, if I had to rewrite the parser, I would opt for the “parse, don’t validate” approach.

The article is really good. But a yacc grammar is pretty much "parse" no? I'm curious which part you consider the "validate".

@bluetech
Copy link
Member

I wonder if it would make sense to switch to using Tree-sitter someday. I would love to get as a bonus correct syntax highlighting for XKB files.

Maybe these days there is an AI clever enough to translate a yacc grammar to a tree-sitter grammar. In my experience they are quite good with translations like this.

@wismill
Copy link
Member Author

wismill commented Jan 20, 2025

As a general note, if I had to rewrite the parser, I would opt for the “parse, don’t validate” approach.

The article is really good. But a yacc grammar is pretty much "parse" no? I'm curious which part you consider the "validate".

@bluetech My point is that we use a bunch of Expr that require further check on their type after the yacc parsing. So Expr is too general and we may end up with error with no reference to erroneous file/line. Consider the silly:

xkb_keymap { };
    xkb_keycodes { };
    xkb_types {
        type "XXX" {
        modifiers[1234] = Shift;
        map[Shift] = Level2;
        level_name[Level1] = "Base";
        level_name[Level2] = "Number";
        };
    };
    xkb_compat { };
    xkb_symbols { };
};

You would just get the following warning, without any reference to the erroneous line:

xkbcommon: WARNING: The modifiers field of a key type is not an array; Illegal array subscript ignored

Imagine now having this in an included key type file: really not ideal for debugging. But at this stage we do not have parser context anymore!

On the other hand, the following file will fail with a better (but still incomplete1) error2:

xkb_keymap {
    xkb_keycodes { 
      <};> = 123;
      <🦆> = 456;
    };
    xkb_types { };
    xkb_compat { };
    xkb_symbols { };
};

That said, I guess that fully parsing a section is slower than the current approach so any include would get slower with the parse, don’t validate approach applied strictly.

Footnotes

  1. (unknown file) instead of the path of the relevant file.

  2. Guess what line is erroneous? The <🦆> one, not the <};> one…

@wismill wismill added this to the 1.9.0 milestone Jan 22, 2025
@wismill wismill merged commit 4ac2226 into xkbcommon:master Jan 24, 2025
4 checks passed
@wismill wismill deleted the keysyms/check-xkbcomp-keywords-clashes branch January 24, 2025 06:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compile-keymap Indicates a need for improvements or additions to keymap compilation keysyms generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants