Skip to content

[WIP] Fix issue 166 #191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 7 commits into
base: mesonbuild
Choose a base branch
from

Conversation

Thomasb81
Copy link
Contributor

@Thomasb81 Thomasb81 commented Jul 14, 2024

Hello

Current draft push request is an attempt to fix : #166

First it remove new line added in preprocessing result. One part of the fix consist to make the preprocessing grammar to consume the training new line. The Ieee1800-2017 chapter 22.4 says

Only white space or a comment may appear on the same line as the `include compiler directive.

So we can remove new line on the line containing directive.
If included file does not end by new line, next line will appear on the last line from the included file. See tests/sv_pp/expected/include_many_dir/from_subdirectory.txt

Second part consist to enrich token build by antlr4 with original file location using file_line_map data.
But the result is not yet correct: current data structure does not allow to retrieve all information we need for every lines.

The CodePosition object has been update to reflect the fileName data. hdlConvertorAst would also need to be updated.

Thomasb81 added 7 commits July 6, 2024 15:00
enrich with file origine of the token.
- Create the associated factory to instruct lexer to create token with our custom token
The factory is responsible to use the file_line_map data.
- Enrich CodePosition with fileName field.
@Thomasb81
Copy link
Contributor Author

Thomasb81 commented Jul 14, 2024

Tests are failing because hdlConvertorAst requiere an update of Codeposition object.

@Nic30
Copy link
Owner

Nic30 commented Jun 10, 2025

@Thomasb81
Can you summarize the state of this issue, optionally can you give me info about what is need to be done?

@Thomasb81
Copy link
Contributor Author

Thomasb81 commented Jun 10, 2025

The CodePosition object has been update to reflect the fileName data. hdlConvertorAst would also need to be updated.

Refer to this class that need to be updated to add a fileName attribute.
https://github.com/Nic30/hdlConvertorAst/blob/19e4c60dab7614cc0fbd56f5e74e770395e7df25/hdlConvertorAst/hdlAst/utils.py#L3 to do not have issue at runtime ...

Having this python object definition in a different repository seems to create dependency circle ... it is not helping to do consistent change...
Actually to do such a change we need to first update hdlConvertorAst ...

I review this PR few day ago but have difficulties to remember

I think new line in preprocessed code that does not come from source code is spotted in src/verilogPreproc/out_buffer.cpp and possibly with the preprocessor lexer change ... expected in testsuite have been updated against this change.

The codePosition object is currently fill with data issue from preprocessor buffer. So we lost the origin of token which are later use in case of error or directly report to user through codePosition object.

The initial idea was to attach the origin filename to each token after preprocessing step so it serves all use-case without addition effort. Original Antlr token object : antlr4::commonToken attach only start/end line but keep the filename global. Which is an issue for verilogSV parser since your source code can be a composition of several files. Think about a macro define that contain a syntax error defined in file A being include in file B : you want to carry in the token that the faulty code is coming from file A at line where the macro is defined.

The idea would be to fix token data before they get used by the verilog/SV parser.

To do so the idea is to use our customized token hdlConvertorToken enrich with filename. So each token create by verilog/SV lexer will be used by verilog/SV parser will have reliable filename and start/end line.

The verilog/SV Lexer is customize with our hdlConvertorTokenFactory responsible to create our customized token using the fileLineMap as a first step, to only address include line change, in the lexer.

	        /*
		 * Customize Lexer with our hdlConvertorToken factory
		 * This allow us to enrich antlr4::CommonToken with filename on
		 * each token, then spread this kind of detail everywhere token are
		 * use.
		 * Dummy structure for vhdl, since no processing step exist.
		 * */
		verilog_pp::FileLineMap dummy_file_line_map;
		hdlConvertorTokenFactory factory;
		factory.setFileLineMap(&dummy_file_line_map);
		lexer.get()->setTokenFactory(factory.DEFAULT.get());

That's where I stop, a little bit lost ... filename is probably correctly propagate, but I guess line are still wrong. Probably I miss understand FileLineMap data or did not figure out how to use them...

That's why the PR is shared as is in draft in case someone want to have a look...

@Thomasb81
Copy link
Contributor Author

Also here #134 (comment) you mention to use smart_prt. to reference fileName in codePosition. Current code make a copy.

Nic30 added a commit to Nic30/hdlConvertorAst that referenced this pull request Jun 11, 2025
@Thomasb81 Thomasb81 mentioned this pull request Jun 15, 2025
@Thomasb81
Copy link
Contributor Author

...Initial though was I can re-use

using FileLineMap = std::vector<FileLineMapItem>;
create along out_buffer is created. But we can't...

Better to have proper and dedicated data structure, that is able to store the origin ( file/str to parse, included file or macro expansion) of each chunk of code put in the output buffer. So hdlConvertorTokenFactory::create can properly correct file, line and pos token attribute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants