"Consume qualified rule" section in the "CSS Syntax Level 3" spec. appears to express same step _twice_, an omission? [css-syntax] #10119

Closed
amn opened this issue Mar 22, 2024 · 3 comments
Labels
Closed as Question Answered Used when the issue is more of a question than a problem, and it's been answered. css-syntax-3

Comments

@amn

amn commented Mar 22, 2024

Reading https://www.w3.org/TR/css-syntax-3/#consume-qualified-rule I am confused as to why two seemingly more or less identical steps are described under "Repeatedly consume next input token". The first step is written thus:

<{-token>
Consume a simple block and assign it to the qualified rule’s block. Return the qualified rule.

Fair enough, but then immediately after, another is written thus:

simple block with an associated token of <{-token>
Assign the block to the qualified rule’s block. Return the qualified rule.

Leaving aside that it's not at all clear how the condition quoted above relates to the sentence "Repeatedly consume next input token" that precedes the steps, it appears to express basically the same procedure as the first, albeit in a less comprehensible form, if you ask me.

Is this an omission, and at any rate, can someone in the know clarify this issue?

If the second entry (step) is removed entirely, the list of steps in fact looks entirely comprehensible.

@amn amn changed the title "Consume qualified rule" section in the spec. appears to mention consumption of simple block _twice_? [css-syntax] "Consume qualified rule" section in the "CSS Syntax Level 3" spec. appears to express same step _twice_, an omission? [css-syntax] Mar 22, 2024
@tabatkins
Member

I recommend reading the Editor's Draft for now; there's been significant editorial rewriting of the specification since the 2021 publication. (I do need to republish, tho.)


That said, the old text is indeed expressing two different things - a { token or a simple block. The algorithm at the time could be invoked with either a raw token list, or a token list that had already been thru the consumption process once before (and thus now contained other types of component values, like simple blocks). This is no longer the case, tho.
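The two branches described above can be illustrated with a minimal Python sketch of the older "consume a qualified rule" steps. This is not text from the specification: the names `Token`, `SimpleBlock`, `QualifiedRule`, `Stream`, and the `consume_*` functions are hypothetical, functions (as component values) are omitted, and tokens are reduced to a `kind` and a `value`. The point is only that the "next input token" may be either a raw `{` token or an already-built simple block, and that the check is a type test on the item rather than a comparison between a token and a block.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Token:
    kind: str            # e.g. "ident", "colon", "{", "}", "EOF" (illustrative kinds)
    value: str = ""


@dataclass
class SimpleBlock:
    associated: str                      # the opening token kind: "{", "[" or "("
    value: list = field(default_factory=list)


@dataclass
class QualifiedRule:
    prelude: list = field(default_factory=list)
    block: Optional[SimpleBlock] = None


EOF = Token("EOF")
MIRROR = {"{": "}", "[": "]", "(": ")"}


class Stream:
    """A list of tokens and/or component values with a cursor."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

    def next(self):
        item = self.items[self.pos] if self.pos < len(self.items) else EOF
        self.pos += 1          # always advance, so reconsume() is symmetric
        return item

    def reconsume(self):
        self.pos -= 1


def consume_component_value(stream):
    item = stream.next()
    if isinstance(item, Token) and item.kind in MIRROR:
        return consume_simple_block(stream, item.kind)
    return item                # a preserved token, or a block built earlier


def consume_simple_block(stream, opening):
    block = SimpleBlock(opening)
    ending = MIRROR[opening]
    while True:
        item = stream.next()
        if isinstance(item, Token) and item.kind in (ending, "EOF"):
            return block
        stream.reconsume()
        block.value.append(consume_component_value(stream))


def consume_qualified_rule(stream):
    rule = QualifiedRule()
    while True:
        item = stream.next()
        if isinstance(item, Token) and item.kind == "EOF":
            return None                       # parse error: return nothing
        if isinstance(item, Token) and item.kind == "{":
            # Branch 1: raw token list, so the block still has to be built.
            rule.block = consume_simple_block(stream, "{")
            return rule
        if isinstance(item, SimpleBlock) and item.associated == "{":
            # Branch 2: the list was processed before; the block already exists.
            rule.block = item
            return rule
        stream.reconsume()
        rule.prelude.append(consume_component_value(stream))
```

Under these assumptions, feeding the function the raw tokens for `a { color: red }` and feeding it the same prelude followed by a prebuilt `SimpleBlock` produce equal `QualifiedRule` values, which is why the two steps look redundant when only raw token lists are considered.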

@tabatkins tabatkins added css-syntax-3 Closed as Question Answered Used when the issue is more of a question than a problem, and it's been answered. labels Mar 22, 2024
@amn
Author

amn commented Mar 23, 2024

I'll have to start over with implementing tokenization and parsing if I am to switch to the editor's draft, no? I don't think I can expect to produce a compliant or even "working" parser/tokenizer were I to pick and choose sections from either depending on which one reads better for the particular section? Like using "consume qualified rule" section from the former and everything else from the latter (which is what I have been using exclusively so far). Isn't that a sensible assumption on my part -- would you suggest I put away the published version and start from the top of the editor's draft?

Regarding the problematic entry, I understand what you were trying to explain about the algorithm, but I fail to see how it would have worked anyway. A token list is a token list, "raw" or not. As far as I have understood the parsing and tokenization procedures from the specification, tokenization produces tokens, in sequence, while parsing produces objects like rules, blocks, etc. What I am trying to say is that nothing expresses what the algorithm would even do when it has a current input token on one hand and a block on the other. I can understand there is some implied condition for comparing the two, but that isn't expressed either. I mean, how would I compare my current input token with a block, simple or otherwise, to determine whether to take the associated action ("Assign the block to the qualified rule’s block. Return the qualified rule.")? All this said, I can see the draft does define "component values" more specifically than the original version:

A token stream is a struct representing a stream of tokens and/or component values.

component value
A component value is one of the preserved tokens, a function, or a simple block.

@tabatkins
Member

Even the /TR version you were originally looking at defines "next input token" in that way:

The token or component value following the current input token in the list of tokens produced by the tokenizer.

Some of the parsing algorithms do a simple run over the token stream, assembling a list of component values (which can be tokens or things like blocks) until they hit the ending token they're looking for, then pass that list into another parsing algorithm to actually do something useful with. So those algos need to be prepared to see a list of unprocessed tokens (which can contain things like a { token) or a list of processed tokens (which will instead contain a simple block holding everything between the original { token and the matching } token).
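The "simple run" described above can be sketched as follows. This is a hedged Python illustration, not spec text: `Token`, `SimpleBlock`, `consume_component_value`, and `gather_until` are hypothetical names, and functions are left out. It shows a first pass that collects component values up to a stop token, so that whatever algorithm receives the gathered list sees a simple block where the raw input had a `{` token.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    kind: str            # illustrative kinds: "ident", "colon", ";", "{", "}"
    value: str = ""


@dataclass
class SimpleBlock:
    associated: str
    value: list = field(default_factory=list)


MIRROR = {"{": "}", "[": "]", "(": ")"}


def consume_component_value(items, pos):
    """Return (component value, new position), starting at items[pos]."""
    item = items[pos]
    pos += 1
    if isinstance(item, Token) and item.kind in MIRROR:
        block = SimpleBlock(item.kind)
        ending = MIRROR[item.kind]
        while pos < len(items):
            nxt = items[pos]
            if isinstance(nxt, Token) and nxt.kind == ending:
                return block, pos + 1         # skip the closing token
            value, pos = consume_component_value(items, pos)
            block.value.append(value)
        return block, pos                     # input ended before the block closed
    return item, pos                          # token or already-built block, kept as-is


def gather_until(items, pos, stop_kind):
    """First pass: collect component values up to, not including, stop_kind."""
    gathered = []
    while pos < len(items):
        nxt = items[pos]
        if isinstance(nxt, Token) and nxt.kind == stop_kind:
            break
        value, pos = consume_component_value(items, pos)
        gathered.append(value)
    return gathered, pos
```

A second algorithm that is handed `gathered` therefore has to accept simple blocks as input items, while the same algorithm run directly on tokenizer output would meet the `{` token instead.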

But again, this is moot now in the current text, as I no longer invoke any algorithms in that way. (I needed to do so because I couldn't tell how to parse the contents of a block without contextual knowledge, but since then we've made some changes that do allow me to unambiguously parse all blocks.)

> I'll have to start over with implementing tokenization and parsing if I am to switch to the editor's draft, no?

Yes, you should use one or the other, and by "or" I really mean "you should use the ED, it's up to date and matches what browsers are expected to do". The differences are minimal, tho - other than an editorial rewrite, the changes since the TR publication were mostly focused around getting CSS Nesting to work properly.

2 participants