-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[libclang/python] Expose Rewriter to the python binding #77269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write If you have received no comments on your PR for a week, you can request a review If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang Author: Jimmy Z (jimmy-zx) ChangesExposes Full diff: https://github.com/llvm/llvm-project/pull/77269.diff 2 Files Affected:
diff --git a/clang/bindings/python/clang/cindex.py b/clang/bindings/python/clang/cindex.py
index d780ee353a133c..ced449180d98fc 100644
--- a/clang/bindings/python/clang/cindex.py
+++ b/clang/bindings/python/clang/cindex.py
@@ -3531,6 +3531,61 @@ def cursor(self):
return cursor
+class Rewriter(ClangObject):
+ """
+ The Rewriter is a wrapper class around clang::Rewriter
+
+ It enables rewriting buffers.
+ """
+
+ @staticmethod
+ def create(tu):
+ """
+ Creates a new Rewriter
+ Parameters:
+ tu -- The translation unit for the target AST.
+ """
+ return Rewriter(conf.lib.clang_CXRewriter_create(tu))
+
+ def __init__(self, ptr):
+ ClangObject.__init__(self, ptr)
+
+ def __del__(self):
+ conf.lib.clang_CXRewriter_dispose(self)
+
+ def insertTextBefore(self, loc, insert):
+ """
+ Insert the specified string at the specified location in the original buffer.
+ """
+ conf.lib.clang_CXRewriter_insertTextBefore(self, loc, insert)
+
+ def replaceText(self, toBeReplaced, replacement):
+ """
+ This method replaces a range of characters in the input buffer with a new string.
+ """
+ conf.lib.clang_CXRewriter_replaceText(self, toBeReplaced, replacement)
+
+ def removeText(self, toBeRemoved):
+ """
+ Remove the specified text region.
+ """
+ conf.lib.clang_CXRewriter_removeText(self, toBeRemoved)
+
+ def overwriteChangedFiles(self):
+ """
+ Save all changed files to disk.
+
+ Returns 1 if any files were not saved successfully, returns 0 otherwise.
+ """
+ return conf.lib.clang_CXRewriter_overwriteChangedFiles(self)
+
+ def writeMainFileToStdOut(self):
+ """
+ Writes the main file to stdout.
+ """
+ conf.lib.clang_CXRewriter_writeMainFileToStdOut(self)
+
+
# Now comes the plumbing to hook up the C library.
# Register callback types in common container.
@@ -3596,6 +3651,13 @@ def cursor(self):
("clang_codeCompleteGetNumDiagnostics", [CodeCompletionResults], c_int),
("clang_createIndex", [c_int, c_int], c_object_p),
("clang_createTranslationUnit", [Index, c_interop_string], c_object_p),
+ ("clang_CXRewriter_create", [TranslationUnit], c_object_p),
+ ("clang_CXRewriter_dispose", [Rewriter]),
+ ("clang_CXRewriter_insertTextBefore", [Rewriter, SourceLocation, c_interop_string]),
+ ("clang_CXRewriter_overwriteChangedFiles", [Rewriter], c_int),
+ ("clang_CXRewriter_removeText", [Rewriter, SourceRange]),
+ ("clang_CXRewriter_replaceText", [Rewriter, SourceRange, c_interop_string]),
+ ("clang_CXRewriter_writeMainFileToStdOut", [Rewriter]),
("clang_CXXConstructor_isConvertingConstructor", [Cursor], bool),
("clang_CXXConstructor_isCopyConstructor", [Cursor], bool),
("clang_CXXConstructor_isDefaultConstructor", [Cursor], bool),
diff --git a/clang/bindings/python/tests/cindex/test_rewrite.py b/clang/bindings/python/tests/cindex/test_rewrite.py
new file mode 100644
index 00000000000000..2c5f904ce50bdc
--- /dev/null
+++ b/clang/bindings/python/tests/cindex/test_rewrite.py
@@ -0,0 +1,75 @@
+import sys
+import io
+import unittest
+import tempfile
+
+from clang.cindex import (
+ Rewriter,
+ TranslationUnit,
+ Config,
+ File,
+ SourceLocation,
+ SourceRange,
+)
+
+
+class TestRewrite(unittest.TestCase):
+ code = """
+int test1;
+
+void test2(void);
+
+int f(int c) {
+ return c;
+}
+"""
+
+ def setUp(self):
+ self.tmp = tempfile.NamedTemporaryFile(suffix=".cpp", buffering=0)
+ self.tmp.write(TestRewrite.code.encode("utf-8"))
+ self.tmp.flush()
+ self.tu = TranslationUnit.from_source(self.tmp.name)
+ self.rew = Rewriter.create(self.tu)
+ self.file = File.from_name(self.tu, self.tmp.name)
+
+ def tearDown(self):
+ self.tmp.close()
+
+ def test_insert(self):
+ snip = "#include <cstdio>\n"
+
+ beginning = SourceLocation.from_offset(self.tu, self.file, 0)
+ self.rew.insertTextBefore(beginning, snip)
+ self.rew.overwriteChangedFiles()
+
+ with open(self.tmp.name, "r", encoding="utf-8") as f:
+ self.assertEqual(f.read(), snip + TestRewrite.code)
+
+ def test_replace(self):
+ pattern = "test2"
+ replacement = "func"
+
+ offset = TestRewrite.code.find(pattern)
+ pattern_range = SourceRange.from_locations(
+ SourceLocation.from_offset(self.tu, self.file, offset),
+ SourceLocation.from_offset(self.tu, self.file, offset + len(pattern)),
+ )
+ self.rew.replaceText(pattern_range, replacement)
+ self.rew.overwriteChangedFiles()
+
+ with open(self.tmp.name, "r", encoding="utf-8") as f:
+ self.assertEqual(f.read(), TestRewrite.code.replace(pattern, replacement))
+
+ def test_remove(self):
+ pattern = "int c"
+
+ offset = TestRewrite.code.find(pattern)
+ pattern_range = SourceRange.from_locations(
+ SourceLocation.from_offset(self.tu, self.file, offset),
+ SourceLocation.from_offset(self.tu, self.file, offset + len(pattern)),
+ )
+ self.rew.removeText(pattern_range)
+ self.rew.overwriteChangedFiles()
+
+ with open(self.tmp.name, "r", encoding="utf-8") as f:
+ self.assertEqual(f.read(), TestRewrite.code.replace(pattern, ""))
|
Pinging @AaronBallman and @jyknight for review. (sorry if I chose the wrong people) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look reasonable to me as far as they go, but I don't have much expertise in Python. I added @Endilll to hopefully help give it a second set of eyes.
You should add a release note (clang/docs/ReleaseNotes.rst) about the new bindings so others know that they've been improved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall.
It's good that you wrote your own tests, but it would also be nice to mirror tests in clang/unittests/libclang/LibclangTest.cpp
which test the same API. This way we can identify issues in binding layer itself (when C++ tests pass, but Python tests don't, or vise versa)
I have reviewed the tests for libclang and it appears that there are already tests for the rewriter, which are more extensive than the ones I wrote. Therefore, I have decided to mirror the tests from libclang in the Python binding. Please let me know if this approach is appropriate. |
If you think that existing tests fully cover yours, then it's appropriate. Otherwise, we'd benefit from additional test coverage, but it's not necessarily in the scope of this PR. |
Both tests are just invoking the functionalities of the rewriter on a sample code. I think additional tests for libclang does not provide extra coverage (as it is just the front-end). Thus I mirrored the tests back to the Python binding in 7a4ebf9. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR description is going to become a commit message after merging. Let me know when it's ready.
Just updated the description a bit and I believe it's all set for merging. |
Hi all, sorry for jumping in late. Minor nit here. It looks like this change is introducing camelCase methods, but the majority of the existing Python file uses snake_case, even though the underlying libclang interface is camelCase. For consistency, can we make these interface functions snake_case or is there some overriding style guide that wants camelCase? |
@linux4life798 Thanks for pointing out! I've updated the naming to conform snake_case. |
Thanks! |
@linux4life798 Nice catch! I see that so far only |
No worries. Of course unrelated to this change, we might be able to add the snake_case equivalent binding and marking the old ones as depreciated. |
I don't think this needs to block this change, but another concern I have about this API is that it is writing to stdout from C++ intermixed with Python writing to its own buffered stdout. This means that the user may see inconsistent output. This is somewhat a deficiency of libclang Rewrite API, since there is only one specific function to "write the main file out" and that is through stdout. In an attempt to keep the world consistent, we may want to consider flushing the Python stdout buffer ( |
@linux4life798 Do you mind filing issues for the points you highlighted? Feel free to skip this step if you (or someone else) plan to submit PRs that address them. |
Added flush before calling Personally I didn't find this function particularly useful: I was hoping to perform rewriting without creating new files, but seems like capturing stdout from a shared library requires some complex operation (
llvm-project/clang/lib/Rewrite/Rewriter.cpp Line 180 in 179ba12
|
@linux4life798 Can you review the last update? |
I think that's reasonable.
I agree. This should probably be exposed to libclang (and then to this python binding). |
This change LGTM (I don't have reviewer rights). |
Ping. Can one of you press the merge button? |
Exposes
CXRewriter
API to the python binding asclass Rewriter
.