[WIP] Add new multithreaded TwoQubitPeepholeOptimization pass #13419
This commit adds a new transpiler pass for physical optimization, TwoQubitPeepholeOptimization. This replaces the use of Collect2qBlocks, ConsolidateBlocks, and UnitarySynthesis in the optimization stage for a default pass manager setup. The pass logically works the same way: it analyzes the dag to get a list of 2q runs, calculates the matrix of each run, and then synthesizes the matrix and substitutes it in place. The distinction this pass makes is that it does this all in a single pass and also parallelizes the matrix calculation and synthesis steps, because there is no data dependency there.

This new pass is not meant to fully replace the Collect2qBlocks, ConsolidateBlocks, or UnitarySynthesis passes, as those also run in contexts where we don't have a physical circuit. This is meant instead to replace their usage in the optimization stage only. Accordingly, this new pass also changes the logic on how we select the synthesis to use and when to make a substitution. Previously this logic was primarily done via the ConsolidateBlocks pass by only consolidating to a UnitaryGate if the number of basis gates needed based on the Weyl chamber coordinates was less than the number of 2q gates in the block (see Qiskit#11659 for discussion on this). Since this new pass skips the explicit consolidation stage, we go ahead and try all the available synthesizers.

Right now this commit has a number of limitations, the largest are:
- Only supports the target
- It doesn't support any synthesizers besides the TwoQubitBasisDecomposer, because it's the only one in rust currently.

For plugin handling I left the logic as running the three pass series, but I'm not sure this is the behavior we want. We could say keep the synthesis plugins for `UnitarySynthesis` only and then rely on our built-in methods for physical optimization only. But this also seems less than ideal because the plugin mechanism is how we support synthesizing to custom basis gates, and also more advanced approximate synthesis methods. Both of those are things we need to do as part of the synthesis here.

Additionally, this is currently missing tests and documentation, and while running it manually "works" as in it returns a circuit that looks valid, I've not done any validation yet. This also likely will need several rounds of performance optimization and tuning. At this point this is just a rough proof of concept and will need a lot of refinement along with larger changes to Qiskit's rust code before this is ready to merge.

Fixes Qiskit#12007
Fixes Qiskit#11659
Since Qiskit#13139 merged we have another two qubit decomposer available to run in rust, the TwoQubitControlledUDecomposer. This commit updates the new TwoQubitPeepholeOptimization to call this decomposer if the target supports appropriate 2q gates.
Clippy is correctly warning that the size difference between the two decomposer types in the TwoQubitDecomposer enum is large. TwoQubitBasisDecomposer is 1640 bytes and TwoQubitControlledUDecomposer is only 24 bytes. This means each ControlledU element is wasting > 1600 bytes. However, in this case that is acceptable in order to avoid a layer of pointer indirection, as these are stored temporarily in a vec inside a thread to decompose a unitary. A trait would be a more natural way to define a common interface between all the two qubit decomposers, but since we keep them instantiated for each edge in a Vec they need to be sized, and doing something like `Box<dyn TwoQubitDecomposer>` (assuming a trait `TwoQubitDecomposer` instead of an enum) to get around this would have additional runtime overhead. This also takes into account that TwoQubitControlledUDecomposer is far less likely to be used in practice, as it only works with targets that have RZZ, RXX, RYY, or RZX gates on an edge, which is less common.
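To make the size trade-off concrete, here is a minimal sketch using stand-in types. `BigDecomposer`, `SmallDecomposer`, `DecomposerEnum`, and the `Decompose` trait are hypothetical names invented for illustration; only the 1640 and 24 byte figures come from the commit message, and the real decomposer structs are not reproduced here.

```rust
use std::mem::size_of;

// Hypothetical stand-ins: only the byte sizes mirror the figures quoted above.
#[allow(dead_code)]
pub struct BigDecomposer {
    data: [u8; 1640],
}

#[allow(dead_code)]
pub struct SmallDecomposer {
    data: [u8; 24],
}

// Enum approach: every value is at least as large as the biggest variant,
// so a Vec of these stores all decomposers inline with no extra pointer hop.
// `clippy::large_enum_variant` is the lint that flags this kind of size gap.
#[allow(dead_code, clippy::large_enum_variant)]
pub enum DecomposerEnum {
    Big(BigDecomposer),
    Small(SmallDecomposer),
}

// Trait-object approach: each element is a fat pointer, but every use goes
// through a heap allocation and a vtable lookup.
pub trait Decompose {
    fn name(&self) -> &'static str;
}

impl Decompose for BigDecomposer {
    fn name(&self) -> &'static str {
        "big"
    }
}

impl Decompose for SmallDecomposer {
    fn name(&self) -> &'static str {
        "small"
    }
}

/// Size of one enum element, regardless of which variant it holds.
pub fn enum_size() -> usize {
    size_of::<DecomposerEnum>()
}

/// Size of the small stand-in on its own.
pub fn small_size() -> usize {
    size_of::<SmallDecomposer>()
}

/// Size of one boxed trait object (data pointer plus vtable pointer).
pub fn boxed_size() -> usize {
    size_of::<Box<dyn Decompose>>()
}
```

Comparing `enum_size()` with `small_size()` shows the padding each small variant carries, while `boxed_size()` shows how compact the trait-object alternative would be at the cost of indirection.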
Also don't run scoring more than needed.
Copying here the comment from @t-imamichi in #13568 (comment):
We should make sure that once PR #13568 and this PR are both merged, we can efficiently transpile circuits into a basis with fractional RZZ gates.
I added support for using the
The priority for the two qubit peephole pass should be decreasing the 2q gate count. The error rate heuristic should only matter if the 2q counts are the same. This commit flips the heuristic to first check the 2q gate count so the first priority is reducing the 2q gate count.
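The ordering described above is a lexicographic comparison: 2q gate count first, estimated error only as a tie-breaker. A minimal sketch follows, assuming a hypothetical `Candidate` struct with made-up field names; the pass's actual scoring types are not shown in this commit message.

```rust
use std::cmp::Ordering;

// Hypothetical summary of one synthesis result; field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Candidate {
    pub two_qubit_gates: usize,
    pub estimated_error: f64,
}

/// Compare by 2q gate count first; fall back to error only when counts tie.
pub fn compare(a: &Candidate, b: &Candidate) -> Ordering {
    a.two_qubit_gates
        .cmp(&b.two_qubit_gates)
        .then_with(|| a.estimated_error.total_cmp(&b.estimated_error))
}

/// Pick the preferred candidate, or None if there are no candidates.
pub fn pick_best(candidates: &[Candidate]) -> Option<Candidate> {
    candidates.iter().copied().min_by(compare)
}
```

With this ordering, a candidate with fewer 2q gates always wins even if its estimated error is higher, which is the behavior change the commit describes.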
@@ -0,0 +1,102 @@
# This code is part of Qiskit.
#
# (C) Copyright IBM 2017, 2024.
- # (C) Copyright IBM 2017, 2024.
+ # (C) Copyright IBM 2017, 2025.
Seems like the copyright years have not often been updated in source files. But since this is a new file, and a new year...
I did write the pass originally in 2024. I've been working on this on and off since November.
@@ -0,0 +1,533 @@
// This code is part of Qiskit.
//
// (C) Copyright IBM 2024
- // (C) Copyright IBM 2024
+ // (C) Copyright IBM 2025
Copying here #13428 (comment): is this example solved by this PR? One case that has not been solved yet is the following:
In this case, the circuit would not get consolidated and re-synthesized into a single RZZ gate.
outputs:
Is it possible to try to synthesize both U and U^(-1) to check which one produces a better synthesis?
produces the circuit:
while its inverse:
produces the circuit:
This commit removes the unitary synthesis plugin mechanism from the pass. Supporting it was a layer violation, since the pass logic doesn't actually use the plugin interface. If plugin interface usage is desired, it is easier and clearer to handle that in the pass manager construction rather than have this pass internally build a pass manager and execute other passes to emulate behavior it doesn't have.
Testing identified two issues, which required fixes as well as adjustments to the tests based on limitations in the pass. The first issue was that the parameters for the target gate were not handled correctly. When using the Controlled U decomposer we were not passing the computed parameter value correctly to the output circuit, and the ParameterExpression from the target was being used instead. The second was that controlled gates (not supercontrolled) with a fixed angle, which are normally intended for the xx decomposer, were incorrectly being passed to the TwoQubitBasisDecomposer, which can't work with them. This was resulting in invalid circuit outputs. The TwoQubitBasisDecomposer is now correctly filtered so that it only runs with supercontrolled gates. The tests were adjusted for this limitation because they were mostly copied from the UnitarySynthesis tests, which support the xx decomposer.
UGate,
ZGate,
RYYGate,
RZZGate,
According to the error log, RZZGate is not used.
RZZGate,
I've suggested some further tests here: mtreinish#31
The typing for some of the new methods on the DAGCircuitBuilder was a bit too strict and required the caller to do more work than was necessary. This commit loosens the typing to make it a bit more ergonomic and straightforward to use. It also more closely matches the DAGCircuit methods the builder struct is mirroring. Right now the only signature difference is that qubits and clbits are wrapped in an Option while on DAGCircuit they are not. This commit doesn't change this difference, although there really isn't a reason to make this distinction and both methods could have the same signature.
This commit updates some of the pass's logic and handles some bugs that were found during development. It also adds a couple new test cases, one of which is failing. There still seem to be bugs in the case we need to use a reverse edge in the target.
Summary
This commit adds a new transpiler pass for physical optimization,
TwoQubitPeepholeOptimization. This replaces the use of Collect2qBlocks,
ConsolidateBlocks, and UnitarySynthesis in the optimization stage for
a default pass manager setup. The pass logically works the same way
where it analyzes the dag to get a list of 2q runs, calculates the matrix
of each run, and then synthesizes the matrix and substitutes it in place.
The distinction this pass makes though is it does this all in a single
pass and also parallelizes the matrix calculation and synthesis steps
because there is no data dependency there.
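The independence between runs can be sketched as follows. This is an illustration only: it uses `std::thread::scope` with one thread per run and 2x2 real matrices as a stand-in for 4x4 complex unitaries, and the names `Matrix`, `run_matrix`, and `all_run_matrices` are hypothetical. The threading strategy of the actual pass is not described here, and a real implementation would more likely use a thread pool.

```rust
use std::thread;

// Stand-in for a 4x4 complex unitary; a 2x2 real matrix keeps the sketch short.
pub type Matrix = [[f64; 2]; 2];

/// Plain matrix product a * b.
pub fn matmul(a: &Matrix, b: &Matrix) -> Matrix {
    let mut out = [[0.0; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

/// Multiply the gates of one run in circuit order (later gates act on the left).
pub fn run_matrix(run: &[Matrix]) -> Matrix {
    let identity = [[1.0, 0.0], [0.0, 1.0]];
    run.iter().fold(identity, |acc, gate| matmul(gate, &acc))
}

/// Compute the matrix of every run concurrently. No run reads another run's
/// data, so the only sequential step left is substituting results afterwards.
/// Results are returned in the same order as the input runs.
pub fn all_run_matrices(runs: &[Vec<Matrix>]) -> Vec<Matrix> {
    thread::scope(|s| {
        let handles: Vec<_> = runs
            .iter()
            .map(|run| s.spawn(move || run_matrix(run)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Synthesis of each resulting matrix could be parallelized the same way, since it also depends only on that run's matrix.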
This new pass is not meant to fully replace the Collect2qBlocks,
ConsolidateBlocks, or UnitarySynthesis passes as those also run in
contexts where we don't have a physical circuit. This is meant instead
to replace their usage in the optimization stage only. Accordingly this
new pass also changes the logic on how we select the synthesis to use
and when to make a substitution. Previously this logic was primarily done
via the ConsolidateBlocks pass by only consolidating to a UnitaryGate if
the number of basis gates needed based on the Weyl chamber coordinates
was less than the number of 2q gates in the block (see #11659 for
discussion on this). Since this new pass skips the explicit
consolidation stage we go ahead and try all the available synthesizers.
Right now this commit has a number of limitations, the largest are:
TwoQubitBasisDecomposer and TwoQubitControlledUDecomposer are used)

This pass doesn't support using the unitary synthesis plugin interface, since
it's optimized to use Qiskit's built-in two qubit synthesis routines written in
Rust. The existing combination of ConsolidateBlocks and UnitarySynthesis
should be used instead if the plugin interface is necessary.
Details and comments
Fixes #12007
Fixes #11659
TODO: