CS代考 Optimal Layout Synthesis for Quantum Computing

Optimal Layout Synthesis for Quantum Computing
Bochen Tan University of California, Los Angeles
Jason of California, Los Angeles
(QC) are inevitable. A classical algorithm is a series of operations on a set of bits; likewise, a quantum algorithm is a series of op- erations, i.e., quantum gates, on a set of quantum bits, i.e., qubits. A realistic quantum computer architecture only supports gates in its gate library. To implement a quantum algorithm, the original quantum gates have to be translated into gates within this library. This process is called logic synthesis for quantum computing. It has been shown that all quantum algorithms can be decomposed to quantum gates acting on a single qubit or two qubits [17], so usually the output of logic synthesis is a list of single-qubit gates and two-qubit gates.

Copyright By PowCoder代写 加微信 powcoder

After logic synthesis, the qubits and quantum gates are still purely logical. Next, a mapping from such logical qubits to the physical qubits on QC architecture and a schedule which decides when every quantum gate is executed have to be provided. (Note that our notion of logical qubit is di￿erent from that in the quantum error correction.) The task of deriving such information is called layout synthesis for quantum computing (LSQC) [24]. Other names have also been used to refer to this task,1e.g., quantum circuit compilation [1, 8], transpilation (in Qiskit ), placement [14, 20], mapping [6, 13, 15, 26, 29], qubit allocation [22], and routing [23]. We choose to use the term ‘layout synthesis’ as it involves both placement and scheduling. There is an important issue to resolve in LSQC. Every two-qubit gate requires a connection between its two qubits. An algorithm designer may assume the logical qubits to have all-to-all connectivity, but this is not the case for the physical qubits in many QC architectures, e.g., the superconducting QC systems at Google [3]. As a result, every QC architecture comes with a coupling graph specifying which physical qubit pairs are connected. If the pair of logical qubits of a two-qubit gate is mapped to a disconnected physical qubit pair on the coupling graph, this gate will not be executable. Luckily, there is a special type of gate, SWAP, that can exchange the mapping of two logical qubits. Thus, with a series of SWAP gates, two logical qubits can reach two connected physical qubits, but this comes at some additional cost. The decisions of when and where to apply these SWAP gates are made in LSQC.
A recent study [24] revealed large optimality gaps of several lead- ing LSQC tools with a set of benchmarks named QUEKO that have known optimal solutions. Therefore, despite the NP-completeness of the problem [14, 22, 24], optimal approaches to LSQC are still worthy of research because they provide the baseline of measur- ing the available LSQC solutions. There have been a few previous works on exact or optimal LSQC [6, 20, 22, 26, 28]. However, they either just focus on speci￿c coupling graphs, or has some implicit and unnecessary constraints. As for the heuristic approaches [8, 13– 15, 22, 23, 25, 29], the optimality gaps suggest that they still have a lot of room for improvement.
In this paper, we present two layout synthesizers and compare them with leading related works on a comprehensive set of bench- mark quantum programs and architectures. The optimal layout
1 h￿ps://qiskit.org/
Recent years have witnessed the fast development of quantum com- puting. Researchers around the world are eager to run larger and larger quantum algorithms that promise speedups impossible to any classical algorithm. However, the available quantum comput- ers are still volatile and error-prone. Thus, layout synthesis, which transforms quantum programs to meet these hardware limitations, is a crucial step in the realization of quantum computing. In this paper, we present two synthesizers, one optimal and one approxi- mate but nearly optimal. Although a few optimal approaches to this problem have been published, our optimal synthesizer explores a larger solution space, thus is optimal in a stronger sense. In addition, it reduces time and space complexity exponentially compared to some leading optimal approaches. The key to this success is a more e￿cient spacetime-based variable encoding of the layout synthe- sis problem as a mathematical programming problem. By slightly changing our formulation, we arrive at an approximate synthesizer that is even more e￿cient and outperforms some leading heuristic approaches, in terms of additional gate cost, by up to 100%, and also ￿delity by up to 10x on a comprehensive set of benchmark pro- grams and architectures. For a speci￿c family of quantum programs named QAOA, which is deemed to be a promising application for near-term quantum computers, we further adjust the approximate synthesizer by taking commutation into consideration, achieving up to 75% reduction in depth and up to 65% reduction in additional cost compared to the tool used in a leading QAOA study.
CCS Concepts: • Computer systems organization ! Quan- tum computing; • Hardware ! Physical synthesis.
Keywords: quantum computing, layout synthesis, allocation, place- ment, scheduling, mapping
ACM Reference Format:
Bochen Tan and . 2020. Optimal Layout Synthesis for Quantum Computing. In . ACM, , NY, USA, 10 pages.
1 Introduction
For years, quantum algorithms have been shown theoretically to hold signi￿cant advantages over classical algorithms on some alge- braic problems like factoring large numbers [21], quantum chem- istry [5], machine learning [7], and so on. Recently, the ￿rst ex- perimental proof of quantum computational advantage on certain sampling problems was published [3].
To implement a quantum algorithm on an actual quantum com- puter, logic synthesis and layout synthesis for quantum computing
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for pro￿t or commercial advantage and that copies bear this notice and the full citation on the ￿rst page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior speci￿c permission and/or a fee. Request permissions from
ICCAD ’20, ,
© 2020 Association for Computing Machinery.
arXiv:2007.15671v1 [cs.AR] 30 Jul 2020

ICCAD ’20, ,
Bochen Tan and
synthesizer for quantum computing (OLSQ) formulates LSQC as a satis￿ability modulo theories optimization problem (SMT). This problem is then passed to Z3 SMT solver [9] which guarantees optimal solutions. Compared to Wille et al. [26], a leading opti- mal approach which also formulates LSQC to SMT, we provide exponential reduction on the number of variables, resulting in orders-of-magnitude reductions in runtime and the memory usage. The key to this improvement is to represent the solution space with mapping variables, not with mapping transformation variables as in [22, 26]. Also, by relaxing the gate placement under the depen- dency constraints, OLSQ explores a larger solution space than other gate-by-gate [22, 26] and level-by-level [6] “optimal” approaches, so its results are sometimes even better. Furthermore, by eliminating redundant mapping variables between transitions, i.e., when the mapping changes due to some SWAP gates, we signi￿cantly reduce the number of variables and derive a highly scalable approximate synthesizer, transition-based-OLSQ (TB-OLSQ). Evaluation results show that TB-OLSQ is nearly optimal and is able to increase ￿delity by 1.30x in geomean, compared to TriQ [15], a leading academic work, and reduce cost by 69.2% in geomean, compared to t|keti [23], a leading industry work. Finally, given that quantum approximate optimization (QAOA) [4, 10, 12, 18] is very much in the spotlight due to its feasibility for near-term quantum computers, we study the LSQC of QAOA. The structure of gates in QAOA makes it an excellent showcase for the power of our transition-based model. Moreover, with additional prior knowledge on commutation re- lations in QAOA, we improve TB-OLSQ for QAOA, denoted as QAOA-OLSQ. Compared to t|keti, which was utilized in a leading QAOA experiment study [4], QAOA-OLSQ is able to reduce depth by 70.2% in geomean and cost by 53.8% in geomean.
2 Background
The input to an LSQC problem consists of two parts. The ￿rst part is a quantum program or a quantum circuit which is speci￿ed as a list of quantum gates or operations 01…L1 on logical qubits q0,q1,…,qM1.(Thereare,intotal,LgatesandM logicalqubits.) Because we assume that logic synthesis is performed prior to LSQC, all of these gates are either single-qubit or two-qubit gates and must be in the gate library of the architecture. An important single-qubit gates is X gate, which negates the qubit; an important two-qubit gate is CX gate, which does not modify the ￿rst qubit but changes the second qubit to the XOR of the ￿rst and second qubits. Until further notice, we use the Qiskit gate library, which contains X and CX mentioned above, and also other single-qubit gates such as H, T, T†, and S. The speci￿c functions of these gates do not concern LSQC; in fact, we only care about on which qubit(s) the gates operate.
For example, the beginning of a quantum adder program intro- duced in [2] is shown below.
OPENQASM 2.0; // code for quantum adder
include qelib1.inc; // gate library
q0 X T••T†• g0 g4 g14
g1 g5 g8 g12 g15 g18
q2 •T• •T†• g6 g11 g16
(a) IBM QX2
g3 g7 g9 g10 g13 g17 g19 g20 g21 g22 Figure 1. Circuit diagram for quantum adder
qreg q[4];
cx q[2], q[3];
// declare 4 logical qubits
(b) Google Sycamore (part of)
(c) Rigetti Aspen-4
Figure 2. Coupling Graphs
In the above program, 0 is an X gate on logical qubit q0; 3 is a CX gate, on q2 and q3. One way to visualize a quantum program is a circuit diagram, e.g., Figure 1, where each wire stands for a logical qubit, and the gates are placed from left to right on the wires according to the program order. To make a distinction between single-qubit and two-qubit gates, we separate them into two lists G1 and G2. For l 2 G1, we use l .q to denote the logical qubit it operates on; for l 2 G2, we use l .q and l .q0. For example, 0.q = q0, 5.q = q1, 3.q = q2, 3.q0 = q3.
The second part of the input to LSQC problem is a coupling graph (P,E),whereP ={p0,p1,…,pN1}isthesetofphysicalqubitsand E = {e0,e1,…,eK1}isthesetof(undirected)connectionsbetween them. (There are, in total, N physical qubits and K edges.) The coupling graphs used in this paper are shown in Figure 2. We denote anedgeas(ek.p,ek.p0),e.g.,inFigure2a,e1.p =p0 ande1.p0 =p2. In addition, ￿delity information can be provided as three functions: f0 : P ! [0, 1] for measurements, f1 : P ! [0, 1] for single-qubit gates, and f2 : E ! [0, 1] for two-qubit gates. (Under current technology constraints, quantum computer providers usually only o￿er one type of entangling two-qubit gate. We are referring the ￿delity for this type of gate.)
The output of LSQC is the spacetime coordinates (tl , xl ) for all the input gates and the SWAP gates inserted, and a ￿nal mapping : Q ! P providing which physical qubit to measure for each logical qubit. We show the LSQC result of the quantum adder (Figure 1) on IBM QX2 (Figure 2a) in Figure 3. In this diagram, gates are aligned according to their time coordinates, e.g., t0 = t1 = t2 = 0 and t10 = 8; the space coordinates of single-qubit gates can be read o￿ from the mapping displayed above the wires, e.g., q0 is mapped to p3 at time 0, so x0 = p3; the space coordinates of two-qubit gates can be deduced from the mapping, e.g., at time 8, q3 and q0 are mapped correspondingly to p1 and p2, so x10 = e2 because e2
(d) IBM Melbourne

Optimal Layout Synthesis for ￿antum Computing
ICCAD ’20, ,
q0 p3 X T • g0 g4
g10 g13 g17 g19 g20 g21 g22 8 9 10 11 12 13 14
the product of ￿delity of all the single-qubit gates, two-qubit gates, and measurements. The current quantum computers are error- prone, so maximal ￿delity is desired.
3 Related Works
Exact Solutions: There have been multiple previous exact or optimal approaches related to LSQC. Shafaei et al. [19] ￿rst ￿nds a qubit mapping on a 2D grid that minimizes expected distances between qubit pairs required by two-qubit gates. Then, if some two-qubit gate still acts on non-adjacent qubits, the xy routing algorithm is applied to move them together. Wille et al. [26, 28] use pseudo- Boolean optimizer and SMT solver to minimize additional gate cost; Siraichi et al. [22] provide a dynamic programming approach to this task. They split the quantum program into individual gates and consider changes to the mapping at every interval. Instead of individual gates, Bhattacharjee et al. [6] splits the program into ‘levels’ of gates that can be executed in parallel. Afterwards, it optimizes depth with integer linear programming to decide the initial mapping and adjustments of mapping between the levels.
We ￿nd that either the gate-by-gate or the level-by-level arrange- ment imposes some additional constraints than arranging the gates with dependency. We shall illustrate this point with the aforemen- tioned LSQC instance. In this case, Wille et al. return a solution with two SWAP gates, but one SWAP would be enough as shown in Figure 3. This is because when arranging gate-by-gate, there is an implicit dependency between each two-qubit gate and the one after it, which are shown as blue and brown arrows in Figure 4. Speci￿cally, there is a dependency from 10 to 11, which means that in their result t10 < t11, but in the optimal solution, Figure 3, t10 > t11. In fact, When we manually impose an additional depen- dency from 10 to 11 onto OLSQ-SWAP (to be speci￿ed in the next section), it gives a solution with two SWAP gates as well. In essence, the gate list is only one of the topological orderings of the depen- dency graph like Figure 4. If we use a gate-by-gate arrangement with the input gate list order, the result certainly respects all the dependencies, but there are some additional arbitrarities that may lead to sub-optimality. OLSQ actually reconstructs the dependency graph from the gate list and thus e￿ectively explores all topological orderings. As for arranging level-by-level, in the LSQC instance above, suppose we assign 8 and 9 as the ￿rst level, 10 and 11 as the second level, 12 and 13 as the third level. Then there must be at least one SWAP inserted between level one and two; otherwise there exists a mapping that can satisfy all the two-qubit connections required by 8 , 9 , 10 , and 11 . This is to say the coupling graph contains a square-like structure, which is impossible in Figure 2a. Similarly, there must be at least one SWAP inserted between level two and three. Thus, at least two SWAP gates are required, which is sub-optimal compared to Figure 3.
Heuristic Solutions: Maslov et al. [14] publish one of the earliest work in LSQC and propose a heuristic algorithm that recursively cuts the coupling graph to derive SWAP gates to adjust the mapping. Siraichi et al. [22] ￿nd the initial mapping by matching out-degrees and search for SWAP gates to apply based on a heuristic distance between qubits. Zulehner et al. [29] partition the program and employs A* search with a lookahead scheme to ￿nd the SWAP gates that transform the current mapping to the next. Li et al. [13] apply bi- directional search to minimize additional gate cost. Childs et al. [8] extend existing approaches to the token swapping problem in order
q2p0•T• p0•T†•
q1 p2 X T g1 g5
g12 g15 g18
g2 g3 g7 g9
0 1 2 3 4 5 6 7
Figure 3. Result of LSQC for quantum adder 04 14
8 12 18 15 15
Figure 4. Immediate dependencies in the quantum adder (Red arrows are used in OLSQ. Green arrows are those imposed implicitly by the approach of Wille et al.’s. Brown means intersection.)
connects p1 and p2. The mapping remains the same as the previous time slot if it is not displayed, e.g., q1 is still mapped to p2 at time 1, so x5 = p2. Thus, the ￿nal mapping is just the last mapping displayed, (q0) = p2, (q1) = p3, (q2) = p0, and (q3) = p1. A SWAPgateconsistingofthreeCXgatesisinsertedone3between p2 and p3. We use the last time slot each SWAP gate takes as its time coordinate (in this case, 7).
The inserted SWAP gate in this example is absolutely necessary. The quantum program has two-qubit gates between q0 and q1 (8, 12, and 18), q1 and q2 (11), q2 and q3 (3, 9, 13, and 19), and ￿nally q3 and q0 (10 and 21). This means, without any SWAP gates, the logical qubits must be mapped to a set of physical qubits connected like a square. However, the coupling graph, Figure 2a, does not contain such a structure.
We stress a few constraints for LSQC. 1) We assume any kind of gate cancellation/optimization has been performed prior to LSQC, so all the input gates should be executed. 2) The two-qubit gates should be mapped to valid edges on the coupling graph. 3) Avoid collisions: if two gates l and l0 act on the same qubit, then they cannot be executed at the same time, i.e., tl , tl0. For example, t5 , t8 because both 5 and 8 act on q1. 4) Respect dependencies: if we have a dependency between l and l 0 , l < l 0, then their relative order should not change, i.e., tl < tl 0 . If there are no gates between these two gates, it is called an immediate dependency. However, deriving the dependencies in a generic quantum program turns out to be a nontrivial task. By default, we treat all the collision gate pairs above as dependencies. This is a common practice, e.g., also in Murali et al. [15], and also guarantees correctness. Immediate dependencies of the quantum adder is shown as red and brown arrows in Figure 4. With more knowledge on dependencies, we can improve the LSQC solutions, as we shall see in Section 5.3. There can be multiple objectives for LSQC: 1) Depth: the maximal time coordinate of all gates. Due to the limitation of current QC technology, physical qubits can only function well up to a short ‘lifetime’. Minimizing depth is thus very important. 2) SWAP Cost or gate count: the total number of additional SWAP gates. 3) Fidelity: ICCAD ’20, , Bochen Tan and for SWAP insertion. Tannu&Qureshi [25] observe the variation in gate ￿delity and thus propose taking ￿delity into the LSQC problem. Murali et al. [15] present an end-to-end LSQC ￿ow for ￿delity optimization. An SMT solver is applied to derive the initial mapping. For SWAP insertion, they construct a reliability matrix of two-qubit gates, which considers the SWAP gates required along the way, and use this matrix to route qubits on the coupling graph. In t|keti [23], when some two-qubit gate is not executable, the algorithm keeps the relevant qubit permutations as candidates and treats the gates ahead to evaluate them. If more than one candidate receives the highest score before the next non-executable gate, the process goes into a new level until one candidate wins. Recently, Alam et al. [1] design a compilation tool speci￿cally for QAOA circuits. They ￿rst assign gates to layers, but due to the commutation relations, most of the dependencies can be ignored, so they apply a two-level search: at the higher level, the order of the layers is decided, followed by a lower level breadth-￿rst search to ￿nd the SWAP gates. Benchmarks: To evaluate these approaches to LSQC, benchmarks such as reversible functions in RevLib [27], gate optimization results of useful logic functions in Amy et al. 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com