Skip to content

Introduce custom sympy srepr parser [stable/1.4] #14022

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Mar 14, 2025

Conversation

mtreinish
Copy link
Member

Summary

This commit introduces a custom parser to QPY for parameter expression payloads that were generated using sympy. Prior to QPY version 10 this was the only way we supported serializing parameter expressions in QPY. For QPY version 10, 11, and 12 sympy could optionally be used if the payload was generated explicitly to not use symengine (in qiskit 1.0 it defaulted to use symengine).
This serialization format relied on sympy to generate a string representation of the expression which we then put in the payload. On deserialization we called sympy's parse_expr() function which internally is calling sympy's sympify() internally. Sympy documents that sympify() relies on Python's eval() for string input and should not be used with untrusted input. But parse_expr() didn't have such a warning (at the time, I plan to contribute adding one), so using this function provides an avenue for arbitrary code execution during QPY deserialization.

This commit fixes this issue by writing a custom parser for the string repesentation in a QPY payload based on python's ast module. This parser walks the abstract syntax tree and builds the sympy expression object as it it goes. It is restricted to the operations that ParameterExpression supports and if any part of the string tries to use functionality outside that set it will error.

For ScheduleBlock objects which also serialize a symbolic expression there is no new parser to securely load the payload. This is because the data model of ScheduleBlock enables using any sympy expression and the limited parser is not sufficient to represent the schedule block. Since this functionality is deprecated and will be removed in Qiskit 2.0.0, potentially vulnerable payloads are rejected by default now. If the input is trusted by the user they can explicitly opt-in to using it by setting a new flag to trust the input. This is technically a breaking API change, but since the payload is potentially vulnerable this is a necessary change to secure QPY.

Details and comments

mtreinish and others added 12 commits February 27, 2025 20:00
This commit introduces a custom parser to QPY for parameter expression
payloads that were generated using sympy. Prior to QPY version 10 this
was the only way we supported serializing parameter expressions in QPY.
For QPY version 10, 11, and 12 sympy could optionally be used if the
payload was generated explicitly to not use symengine (in qiskit 1.0 it
defaulted to use symengine).
This serialization format relied on sympy to generate a string
representation of the expression which we then put in the payload. On
deserialization we called sympy's `parse_expr()` function which
internally is calling sympy's `sympify()` internally. Sympy documents
that `sympify()` relies on Python's `eval()` for string input and
should not be used with untrusted input. But `parse_expr()` didn't have
such a warning (at the time, I plan to contribute adding one), so
using this function provides an avenue for arbitrary code execution
during QPY deserialization.

This commit fixes this issue by writing a custom parser for the string
repesentation in a QPY payload based on python's ast module. This parser
walks the abstract syntax tree and builds the sympy expression object as
it it goes. It is restricted to the operations that
`ParameterExpression` supports and if any part of the string tries to
use functionality outside that set it will error.

For `ScheduleBlock` objects which also serialize a symbolic expression
there is no new parser to securely load the payload. This is because
the data model of `ScheduleBlock` enables using any `sympy` expression
and the limited parser is not sufficient to represent the schedule
block. Since this functionality is deprecated and will be removed in
Qiskit 2.0.0, potentially vulnerable payloads are rejected by default
now. If the input is trusted by the user they can explicitly opt-in to
using it by setting a new flag to trust the input. This is technically a
breaking API change, but since the payload is potentially vulnerable
this is a necessary change to secure QPY.
The calibrations path was apparently never using symengine encoding for
pulse schedules even if it was available and set. This is a bug but fixing
it prevents us from loading historical payloads. Since this is
pre-existing this restores the previous behavior. We can fix the
underlying bug in an independent PR for a 1.4.x release that uses the
qiskit version as a gate on how to interpret the symbolic encoding flag.
This commit looks for all the calls to sympy that internally call eval()
and either does a runtime check to ensure we're not passing user defined
strings to those functions or rewrites the functions to not rely on
those sympy functions.
Previously the exception raised in the pulse path was over eagerly
raising for any pulse payload using sympy encoding, whether it contained
a symbolic pulse or not. This commit fixes this oversight so that the
exception is only raised when we go to use sympy's deserialization.
Tests are added to ensure we still raise with a symbolic pulse but not
if the schedule block doesn't contain any symbolic expressions.
Co-authored-by: Elena Peña Tapia <[email protected]>
@mtreinish mtreinish added priority: high Changelog: Bugfix Include in the "Fixed" section of the changelog labels Mar 14, 2025
@mtreinish mtreinish added this to the 1.4.2 milestone Mar 14, 2025
@mtreinish mtreinish requested a review from a team as a code owner March 14, 2025 13:03
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think that it should be ready to merge if it passes CI.

@mtreinish mtreinish merged commit 3da3d5a into Qiskit:stable/1.4 Mar 14, 2025
17 checks passed
@mtreinish mtreinish deleted the advisory-fix-stable-1.4 branch March 14, 2025 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: Bugfix Include in the "Fixed" section of the changelog priority: high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants