-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Introduce custom sympy srepr parser [stable/1.4] #14022
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
mtreinish
merged 12 commits into
Qiskit:stable/1.4
from
mtreinish:advisory-fix-stable-1.4
Mar 14, 2025
Merged
Introduce custom sympy srepr parser [stable/1.4] #14022
mtreinish
merged 12 commits into
Qiskit:stable/1.4
from
mtreinish:advisory-fix-stable-1.4
Mar 14, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This commit introduces a custom parser to QPY for parameter expression payloads that were generated using sympy. Prior to QPY version 10 this was the only way we supported serializing parameter expressions in QPY. For QPY version 10, 11, and 12 sympy could optionally be used if the payload was generated explicitly to not use symengine (in qiskit 1.0 it defaulted to use symengine). This serialization format relied on sympy to generate a string representation of the expression which we then put in the payload. On deserialization we called sympy's `parse_expr()` function which internally is calling sympy's `sympify()` internally. Sympy documents that `sympify()` relies on Python's `eval()` for string input and should not be used with untrusted input. But `parse_expr()` didn't have such a warning (at the time, I plan to contribute adding one), so using this function provides an avenue for arbitrary code execution during QPY deserialization. This commit fixes this issue by writing a custom parser for the string repesentation in a QPY payload based on python's ast module. This parser walks the abstract syntax tree and builds the sympy expression object as it it goes. It is restricted to the operations that `ParameterExpression` supports and if any part of the string tries to use functionality outside that set it will error. For `ScheduleBlock` objects which also serialize a symbolic expression there is no new parser to securely load the payload. This is because the data model of `ScheduleBlock` enables using any `sympy` expression and the limited parser is not sufficient to represent the schedule block. Since this functionality is deprecated and will be removed in Qiskit 2.0.0, potentially vulnerable payloads are rejected by default now. If the input is trusted by the user they can explicitly opt-in to using it by setting a new flag to trust the input. This is technically a breaking API change, but since the payload is potentially vulnerable this is a necessary change to secure QPY.
The calibrations path was apparently never using symengine encoding for pulse schedules even if it was available and set. This is a bug but fixing it prevents us from loading historical payloads. Since this is pre-existing this restores the previous behavior. We can fix the underlying bug in an independent PR for a 1.4.x release that uses the qiskit version as a gate on how to interpret the symbolic encoding flag.
Co-authored-by: Jake Lishman <[email protected]>
This commit looks for all the calls to sympy that internally call eval() and either does a runtime check to ensure we're not passing user defined strings to those functions or rewrites the functions to not rely on those sympy functions.
Previously the exception raised in the pulse path was over eagerly raising for any pulse payload using sympy encoding, whether it contained a symbolic pulse or not. This commit fixes this oversight so that the exception is only raised when we go to use sympy's deserialization. Tests are added to ensure we still raise with a symbolic pulse but not if the schedule block doesn't contain any symbolic expressions.
Co-authored-by: Elena Peña Tapia <[email protected]>
One or more of the following people are relevant to this code:
|
ElePT
approved these changes
Mar 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think that it should be ready to merge if it passes CI.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
This commit introduces a custom parser to QPY for parameter expression payloads that were generated using sympy. Prior to QPY version 10 this was the only way we supported serializing parameter expressions in QPY. For QPY version 10, 11, and 12 sympy could optionally be used if the payload was generated explicitly to not use symengine (in qiskit 1.0 it defaulted to use symengine).
This serialization format relied on sympy to generate a string representation of the expression which we then put in the payload. On deserialization we called sympy's parse_expr() function which internally is calling sympy's sympify() internally. Sympy documents that sympify() relies on Python's eval() for string input and should not be used with untrusted input. But parse_expr() didn't have such a warning (at the time, I plan to contribute adding one), so using this function provides an avenue for arbitrary code execution during QPY deserialization.
This commit fixes this issue by writing a custom parser for the string repesentation in a QPY payload based on python's ast module. This parser walks the abstract syntax tree and builds the sympy expression object as it it goes. It is restricted to the operations that ParameterExpression supports and if any part of the string tries to use functionality outside that set it will error.
For ScheduleBlock objects which also serialize a symbolic expression there is no new parser to securely load the payload. This is because the data model of ScheduleBlock enables using any sympy expression and the limited parser is not sufficient to represent the schedule block. Since this functionality is deprecated and will be removed in Qiskit 2.0.0, potentially vulnerable payloads are rejected by default now. If the input is trusted by the user they can explicitly opt-in to using it by setting a new flag to trust the input. This is technically a breaking API change, but since the payload is potentially vulnerable this is a necessary change to secure QPY.
Details and comments