Perform BarrierBeforeFinalMeasurements analysis in parallel (#13411)
* Use OnceLock instead of OnceCell
OnceLock is a thread-safe version of OnceCell that enables us to use
PackedInstruction from a threaded environment. There is some overhead
associated with this, primarily in memory as the OnceLock is a larger
type than a OnceCell. But the tradeoff is worth it to start leveraging
multithreading for circuits.
Fixes #13219
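
As a rough illustration of why the switch matters, here is a minimal sketch using only the standard library. `CachedInstruction`, its fields, and `labels_from_threads` are hypothetical stand-ins invented for this example; they are not Qiskit's `PackedInstruction` or its API. The point is only that a struct holding a `OnceLock` can be shared by reference across threads, which a struct holding a `OnceCell` cannot.

```rust
use std::sync::OnceLock;
use std::thread;

/// Hypothetical stand-in for an instruction with a lazily built cache.
/// Not Qiskit's `PackedInstruction`; the fields are illustrative only.
struct CachedInstruction {
    name: String,
    // `OnceLock<T>` is `Sync` when `T: Send + Sync`, so `&CachedInstruction`
    // can be handed to several threads. `std::cell::OnceCell` is never `Sync`.
    cache: OnceLock<String>,
}

impl CachedInstruction {
    fn new(name: &str) -> Self {
        Self {
            name: name.to_string(),
            cache: OnceLock::new(),
        }
    }

    /// Build the cached value on first access; later calls reuse it.
    fn cached_label(&self) -> &str {
        self.cache.get_or_init(|| format!("op:{}", self.name))
    }
}

/// Read the lazily initialized cache from `n_threads` scoped threads.
fn labels_from_threads(inst: &CachedInstruction, n_threads: usize) -> Vec<String> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..n_threads)
            .map(|_| s.spawn(|| inst.cached_label().to_string()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Whichever thread reaches `get_or_init` first runs the initializer; the others wait and then observe the same value.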
* Update twirling too
* Perform BarrierBeforeFinalMeasurements analysis in parallel
With #13410 removing the non-threadsafe structure from our circuit
representation we're now able to read and iterate over a DAGCircuit from
multiple threads. This commit is the first small piece of that work: it
moves the analysis portion of the BarrierBeforeFinalMeasurements pass to
execute in parallel. The pass checks every node to ensure all its
descendants are either a measure or a barrier before reaching the end of
the circuit. This commit iterates over all the nodes and does the check
in parallel.
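
The per-node check described above can be sketched on a toy graph with only the standard library. Everything here is an assumption made for illustration: `ToyDag`, `Kind`, and the function names are hypothetical, output nodes are left out of the successor lists, and plain scoped threads stand in for whatever parallel iterator the pass actually uses.

```rust
use std::thread;

#[derive(Clone, Copy)]
enum Kind {
    Gate,
    Measure,
    Barrier,
}

/// Hypothetical toy graph: `successors[n]` lists the op nodes that follow `n`.
struct ToyDag {
    kinds: Vec<Kind>,
    successors: Vec<Vec<usize>>,
}

fn is_final_kind(kind: Kind) -> bool {
    matches!(kind, Kind::Measure | Kind::Barrier)
}

/// True if every descendant of `start` is a measure or a barrier.
fn only_final_descendants(dag: &ToyDag, start: usize) -> bool {
    let mut seen = vec![false; dag.kinds.len()];
    let mut stack = dag.successors[start].clone();
    while let Some(node) = stack.pop() {
        if seen[node] {
            continue;
        }
        seen[node] = true;
        if !is_final_kind(dag.kinds[node]) {
            return false;
        }
        stack.extend(dag.successors[node].iter().copied());
    }
    true
}

/// Check every node, splitting the node indices across `n_threads` threads.
/// The result is in ascending node order.
fn final_ops_parallel(dag: &ToyDag, n_threads: usize) -> Vec<usize> {
    let indices: Vec<usize> = (0..dag.kinds.len()).collect();
    let workers = n_threads.max(1);
    let chunk = ((indices.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = indices
            .chunks(chunk)
            .map(|part| {
                s.spawn(move || {
                    part.iter()
                        .copied()
                        .filter(|&n| {
                            is_final_kind(dag.kinds[n]) && only_final_descendants(dag, n)
                        })
                        .collect::<Vec<usize>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Each node's check only reads the graph, so the threads need no locking; this is the property that a thread-safe circuit representation makes available.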
* Remove allocation for node scan
* Refactor pass to optimize search and set parallel threshold
This commit updates the logic in the pass to simplify the search
algorithm and improve its overall efficiency. Previously the pass would
search the entire dag for all barriers and measurements and then do a
BFS from each found node to check that all descendants are either
barriers or measurements. Then, with the set of nodes matching that
condition, a full topological sort of the dag was run and the
topologically ordered nodes were filtered for the matching set. That
sorted set was then used for filtering.
This commit refactors this to do a reverse search from the output
nodes, which reduces the complexity of the algorithm. This new algorithm
is also conducive to parallel execution because it does a search
starting from each qubit's output node. In a test with quantum volume
circuits from 10 to 1000 qubits, which scale linearly in depth and
number of qubits, a crossover point between the parallel and serial
implementations was found around 150 qubits.
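
One way to picture the reverse search is the sketch below. It is not the code in this commit: `ToyDag`, `last_op`, `search_from_output`, and the `parallel_threshold` parameter are all hypothetical, output nodes are modelled only as a per-qubit "last op" index, and scoped threads from the standard library stand in for the real parallel machinery. A caller might pass 150 as the threshold to mirror the crossover reported above, though the value used in the pass itself is not stated here.

```rust
use std::collections::{BTreeSet, HashMap};
use std::thread;

#[derive(Clone, Copy)]
enum Kind {
    Gate,
    Measure,
    Barrier,
}

/// Hypothetical toy graph. `last_op[q]` is the op feeding qubit q's output.
struct ToyDag {
    kinds: Vec<Kind>,
    successors: Vec<Vec<usize>>,
    predecessors: Vec<Vec<usize>>,
    last_op: Vec<Option<usize>>,
}

impl ToyDag {
    fn from_edges(kinds: Vec<Kind>, edges: &[(usize, usize)], last_op: Vec<Option<usize>>) -> Self {
        let mut successors = vec![Vec::new(); kinds.len()];
        let mut predecessors = vec![Vec::new(); kinds.len()];
        for &(from, to) in edges {
            successors[from].push(to);
            predecessors[to].push(from);
        }
        Self { kinds, successors, predecessors, last_op }
    }
}

/// A node is final if it is a measure/barrier and all its successors are final.
fn is_final(dag: &ToyDag, node: usize, memo: &mut HashMap<usize, bool>) -> bool {
    if let Some(&known) = memo.get(&node) {
        return known;
    }
    let result = matches!(dag.kinds[node], Kind::Measure | Kind::Barrier)
        && dag.successors[node].iter().all(|&s| is_final(dag, s, memo));
    memo.insert(node, result);
    result
}

/// Walk backwards from one qubit's output, stopping at the first non-final node.
fn search_from_output(dag: &ToyDag, qubit: usize) -> BTreeSet<usize> {
    let mut memo = HashMap::new();
    let mut found = BTreeSet::new();
    let mut stack: Vec<usize> = dag.last_op[qubit].into_iter().collect();
    while let Some(node) = stack.pop() {
        if found.contains(&node) || !is_final(dag, node, &mut memo) {
            continue; // ancestors of a non-final node cannot be final either
        }
        found.insert(node);
        stack.extend(dag.predecessors[node].iter().copied());
    }
    found
}

/// Run the per-qubit searches serially below the threshold, in threads above it.
fn find_final_ops(dag: &ToyDag, parallel_threshold: usize) -> BTreeSet<usize> {
    let n_qubits = dag.last_op.len();
    if n_qubits < parallel_threshold {
        return (0..n_qubits).flat_map(|q| search_from_output(dag, q)).collect();
    }
    let workers = thread::available_parallelism().map_or(1, |n| n.get());
    let qubits: Vec<usize> = (0..n_qubits).collect();
    let chunk = ((n_qubits + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = qubits
            .chunks(chunk)
            .map(|part| {
                s.spawn(move || {
                    part.iter()
                        .flat_map(|&q| search_from_output(dag, q))
                        .collect::<BTreeSet<usize>>()
                })
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

Because each search starts at a different output and only reads the graph, the searches are independent and their results can simply be unioned. In this sketch the threshold keeps small circuits on the serial path, where thread startup would cost more than it saves.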
* Update crates/circuit/src/dag_circuit.rs
Co-authored-by: Raynel Sanchez <[email protected]>
* Rework logic to check using StandardInstruction
* Add comments explaining the search function
* Update crates/circuit/src/dag_circuit.rs
Co-authored-by: Raynel Sanchez <[email protected]>
---------
Co-authored-by: Raynel Sanchez <[email protected]>