CS代考 CA 00000001 00000010

c 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
OPTIMALITY STUDY OF EXISTING QUANTUM COMPUTING LAYOUT SYNTHESIS TOOLS
Bochen S, UCLA CS, UCLA

Copyright By PowCoder代写 加微信 powcoder

July 14, 2020
Layout synthesis, an important step in quantum computing, processes quantum circuits to satisfy device layout constraints. In this paper, we construct QUEKO benchmarks for this problem, which have known optimal depths and gate counts. We use QUEKO to evaluate the optimality of current layout synthesis tools, including Cirq from Google, Qiskit from IBM, t|keti from Cambridge Quantum Computing, and recent academic work. To our surprise, despite over a decade of research and development by academia and industry on compilation and synthesis for quantum circuits, we are still able to demonstrate large optimality gaps: 1.5-12x on average on a smaller device and 5-45x on average on a larger device. This suggests substantial room for improvement of the efficiency of quantum computer by better layout synthesis tools. Finally, we also prove the NP-completeness of the layout synthesis problem for quantum computing. We have made the QUEKO benchmarks open-source.
1 Introduction
Recently, a quantum processor “Sycamore” from Google was shown to have a clear advantage over classical super- computers on a problem named sampling random quantum circuit [1]. It is widely expected that in the near future, quantum computing (QC) will outperform its classical counterpart solving more and more problems. Achieving this computational advantage, however, requires executing larger and larger quantum circuits.
A quantum circuit consists of quantum gates acting on qubits. It was shown that only gates acting on one or two qubits are required for universal quantum computing [2]. After a quantum circuit is designed, it needs to be mapped to a QC device. However, qubit connections required by two-qubit gates are often greatly constrained by QC device layouts. QC layout synthesis resolves this issue by producing an initial mapping from the qubits in the circuit to the physical qubits on the QC device, adjusting the mapping to legalize two-qubit gates by inserting some new gates, and scheduling all the gates. The resulting circuit preserves the original functionality and is executable on the QC device.
Quantum circuits in earlier experiments used to be only dozens of gates on small devices, e.g., those with 5 or fewer qubits. In those cases, layout synthesis was usually realized by exhaustive enumeration. However, the task is increasingly intractable as the circuits get deeper and wider. Nowadays, a cutting-edge QC experiment requires the execution of a circuit of 53 qubits, 1113 single-qubit gates, and 430 two-qubit gates [1]. For a general circuit of this size, the number of possible initial mappings is 53!, and the subsequent scheduling and legalization steps have large solution space as well. Clearly, design automation is necessary. In addition, the size of the circuits that QC hardware is able to execute
arXiv:2002.09783v4 [quant-ph] 13 Jul 2020

has been scaling exponentially in the past few years [3]. The fast increase of hardware capacity presents an even bigger challenge to layout synthesis.
Several layout synthesis tools are available and there are also benchmarks that help us to compare them. However, it is currently unknown how far these tools are away from the optimal solutions. In this paper, we present QUEKO benchmarks (quantum mapping examples with known optimal) which are quantum circuits with known optimal depth for the given QC device. Then, we evaluate four existing QC layout synthesis tools with QUEKO, namely t|keti1, greedy router included in Cirq2 , DenseLayout plus StochasticSwap included in Qiskit3 , and the recent academic work by Zulehner et al. [4]. To our surprise, rather large optimality gaps are discovered even for feasible-depth circuits: 1.5-12x on average on a smaller device and 5-45x on average on a larger device.
This result is somewhat surprising as, in comparison, the result of the VLSI circuit placement optimality study conducted more than 15 years ago using the PEKO benchmarks [5] revealed that the optimality gaps of the leading academic and industrial placers were 1.43-2.12X on average for circuits with over two million placable objects, while the quantum circuits used in this study have only up to 54 qubits and about 35 thousand quantum gates, yet shown a much large optimality gap.
The optimality gaps revealed in this study have a strong implication. If we can consistently halve the circuit depth by better layout synthesis, we effectively double the decoherence time of a QC device, which is equivalent to a large advancement in experimental physics and electrical engineering. Therefore, the gaps call for more research investments into QC layout synthesis. To draw a parallel, the VLSI placement optimality study using the PEKO benchmarks [5] spurred further research investment in circuit placement, resulting in wirelength reduction equivalent to two or more generations of Moore’s Law scaling, but in a more cost-efficient way [6].
The rest of this paper is organized as follows: Sec. 2 reviews relevant background of QC, Sec. 3 formulates the QC layout synthesis problem; Sec. 4 reviews related work; Sec. 5 provides the construction of QUEKO benchmarks; Sec. 6 evaluates aforementioned tools with QUEKO; Sec. 7 proves the NP-completeness of QC layout synthesis; Sec. 8 gives conclusion.
2 Background
2.1 | irepresentedbyavectorintwo-dimensionalHilbertspacewithL2-normequals1
2https://github.com/quantumlib/Cirq 3
| i=✓↵◆=↵✓10◆+✓01◆=↵|0i+|1i, (1)
where the two basis vectors are |0i and |1i. A quantum state of multiple qubits lies in the tensor product of individual Hilbert spaces. For instance, a general two-qubit state |i is
|i = ↵|0i|0i + |0i|1i + |1i|0i + |1i|1i = (↵ )T , (2) where we omit the tensor product notation ⌦ between |·is for convenience. A joint state of two individual qubits
| 1i⌦| 0iis
2.2 Quantum Gates
| i| i=✓↵◆⌦✓↵0◆=(↵↵0 ↵0 ↵0 0)T. (3) 1 0 0
A quantum gate transforms an input state to an output state. For example, some common single-qubit gates are X, H, and T . † means transpose complex conjugate.
X| i=✓0 1◆✓↵◆=✓◆, H=p2✓1 1◆,T=✓1 0 ◆, T†=✓1 0 ◆. (4) 1 0 ↵ 2 1 1 0 ei⇡/4 0 ei⇡/4
Two common two-qubit gates are CZ and CX (also named CNOT).
01 0 0 01 01 0 0 01
CZ = 1 0 0 CA, CX = 1 0 0CA. (5) 0010 0001
1https://github.com/CQCL/pytket
https://Qiskit.org/

NAND gates are sufficient for universal classical computing. For universal quantum computing, there are multiple complete gate sets. Table 1 lists three such sets chosen by QC frameworks Cirq2, Qiskit3, and pyQuil4. Other gates may be input to these frameworks, but would be translated into a combination of these native gates. The exact matrix representations of these gates are not specified here because they are irrelevant to the purpose of this paper.
Table 1: Complete quantum gate set examples
Cirq (Google) Qiskit (IBM) pyQuil (Rigetti)
Single-qubit gate
Z(), phased X power U1(), U2(, ), U3(✓, , ) RZ(), RX(k⇡/2)
Two-qubit gate
Another important gate is CCX, or Toffoli, gate which is universal for reversible logic and thus essential for QC logic synthesis [7]. It is a quantum gate on three qubits and can be decomposed into a set of single-qubit and two-qubit gates as shown in the following subsection.
01 0 0 0 0 0 0 01 B0 1 0 0 0 0 0 0C B0 0 1 0 0 0 0 0C
CCX=B0 0 0 1 0 0 0 0C (6) B0 0 0 0 1 0 0 0C
0 0 0 0 1 0 0CA 00000001 00000010
An important property between two gates is whether they commute. If the result of first applying gate 0 followed by gate 1 is the same with the result of the other way around, we say these two gates commute. If two gates act on totally different qubits, they always commute. For gates acting on a same qubit, it is nontrivial to judge in general. T and CZ commute because they are both diagonal matrices; but X and H do not commute, since XH 6= HX.
For more background knowledge, the reader can refer to a comprehensive textbook on quantum computing and quantum information [2], or a concise textbook on quantum computing [8] with more computer science flavor.
2.3 Quantum Circuit
In QC, a circuit or program is usually input as a piece of QASM code [9], e.g., Fig. 1a. The code is rather simple to read, merely specifying each gate sequentially like instructions in traditional assembly language. We thus define a quantum circuit to be a list of quantum gates g1g2…gM .
It is important to note that the qubits in a quantum circuit are logical qubits denoted as q1 , …, qN . We do not use “logical” to indicate error correction in this paper. Additionally, we denote qubit count by N , gate count by M , single-qubit gate countbyM1,andtwo-qubitgatecountbyM2.Forinstance,inFig.1,N =3,M =15,M1 =9,M2 =6.
We use the notation of gate like a set of qubits. We say the cardinality |g| = 1 if g is a single-qubit gate and |g| = 2 if g is a two-qubit gate. g [ g0 denotes the set of qubits involved in g or g0. g \ g0 denotes the set of qubits involved ingandg0.Forinstance,inFig.1,q1,q2 2g2,q0 2g13,|g4|=|g6|=2,|g7|=|g9|=1,g1\g2 ={q2},and g8 [g9 ={q0,q1,q2}.
A 1D circuit diagram can also represent the circuit, e.g., Fig. 1b. In such a diagram, each wire stands for a logical qubit. We draw the control qubit of CX gate as • and the target qubit as . The 1D diagram provides some implicit timing information, i.e., the gates on the same horizontal wire are executed from left to right. However, vertical alignment has no indication on timing, e.g., g8 and g9 can be executed simultaneously, but we separate them by some horizontal distance purely to avoid overlapping in the diagram.
2.4 Quantum Computing Device Representation
We represent the layout of a QC device with a device graph G = (P, E) where each node p stands for a physical qubit and each edge e stands for a connection that enables two-qubit entangling gates, i.e., we can only perform such gates
4https://github.com/rigetti/pyquil

OPENQASM 2.0;
include “qelib1.inc”;
qreg q[3]; // Initiate 3 logical qubits
h q[2]; //g1
cx q[1], q[2]; //g2 tdg q[2]; //g3
cx q[0], q[2]; //g4 t q[2]; //g5
cx q[1], q[2]; //g6 tdg q[2]; //g7
cx q[0], q[2]; //g8 t q[1]; //g9
t q[2]; //g10
cx q[0], q[1]; //g11 h q[2]; //g12
t q[0]; //g13
tdg q[1]; //g14
cx q[0], q[1]; //g15
(a) QASM code of Toffoli circuit
g4 g8 g11 g13 g15 q0 •••T•
g2 g6 g9 g14 q1 • • T T†
g1 g3 g5 g7 g10 g12 q2 H T† T T† T H
(b) 1D diagram of Toffoli circuit
q1 g2,g6 g11, g15
q0 q2 g4,g7 (c) Two-qubit gate set of Toffoli circuit
Figure 1: Toffoli circuit (Single-qubit gates are colored gray. Identical two-qubit gates applied at different times have the same color, e.g., g2 and g6 are orange because they are both CX(q1, q2) but at different times.)
on two physical qubits that are connected. This graph is also named as coupling graph or qubit connectivity. Device graphs used in this paper are shown in Fig. 2.
In the coupling graphs, we assume the edges are bi-directional. We made this decision based on the following reasons. First, if the two-qubit gate in the gate library is symmetric, like CZ gate used by Google and Rigetti as shown in Table 1, then it does not make sense to have directed connections. Second, even for asymmetric gates like CX, most of the IBM quantum computers5 now support both directions for every edge. Third, as we shall see in Sec. 5, our construction of QUEKO benchmarks works for directed graphs as well. So, the assumption of bi-directionality does not change the conclusion of this study.
3 Problem Formulation
Layout synthesis is divided into two sub-tasks: initial placement that produces an initial mapping from logical qubits to physical qubits, and gate scheduling that decides when and where to execute each input gate and insert SWAP gates to make two-qubit gates legal on the device graph. However, a SWAP gate may increase the depth, and definitely increase the gate count and the error rate. In the whole QC compilation/mapping workflow, there are usually circuit optimization
5https://quantum-computing.ibm.com/

(a) IBM’s Ourense device graph
(b) Rigetti’s Aspen-4 device graph
(c) IBM’s Tokyo device graph
(d) IBM’s Rochester device graph
Figure2: QCdeviceexamples
stages before or after layout synthesis. During the optimization stages, gate reduction and commutation are performed, e.g., [10]. We assume that these optimizations are already applied prior to layout synthesis, so that every input gate should be executed and the relative order of input gates should not change in the layout synthesis process.
physical qubits logical qubits
(a) Initial placement for Toffoli circuit
g ̃14 g ̃16 g ̃18 p0 • T •
(e) Google’s Sycamore device graph
g ̃4 q0 p0 •
q1p3 • • T p1 T†
g ̃1 g ̃3 g ̃5 g ̃7 g ̃10 g ̃15 q2p1HT†TT†T p3H
t 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 (b) Scheduled Toffoli circuit
Figure 3: Quantum computing layout synthesis for Toffoli circuit on IBM’s Ourense device
g ̃11 g ̃12 g ̃13 ••

3.1 Initial Placement
During initial placement, we need to find a mapping from logical qubits in the quantum circuit to physical qubits on the device μ0 : Q ! P that benefits subsequent gate scheduling. If the two-qubit gate set, consisting of all the two-qubit gates, can be embedded in the device graph during initial placement (e.g., Fig. 1c can be embedded in Fig. 2c), gate scheduling does not necessarily need to insert any gates. However, the case is not so ideal in general. For example, if we want to map the Toffoli circuit as shown in Fig. 1b onto device Ourense as shown in Fig. 2a, there must be some additional gates, since the device graph does not contain any triangles, but the two-qubit gates in Fig. 1c forms a triangle. A valid initial placement in this case is given in Fig. 3a where μ0(q0) = p0, μ0(q1) = p3, and μ0(q2) = p1.
3.2 Gate Scheduling
Given a quantum circuit g1 …gM , e.g., Fig. 1b, gate scheduling produces the spacetime coordinates (tj , xj ) for each gate. The coordinates specify when and where the gates are applied. We say that a gate is scheduled to cycle t if its time coordinate is t. For a single-qubit gate, the space coordinate is a physical qubit, i.e., xj 2 P ; for a two-qubit gate, it is an edge in the device graph, i.e., xj 2 E. SWAP gates may need to be inserted during gate scheduling to ensure that all two-qubit gates are executable. The input gate plus the inserted SWAP gates constitute the scheduled gate list g ̃1, …, g ̃M ̃ . Since only SWAP gates are inserted and all the input gates are contained in the scheduled gate list, the functionality of input circuit remains unchanged after the layout synthesis process. Additionally, gate scheduling must respect dependencies in the quantum circuit. If gate g acts on qubit q, then g can only be executed after all prior gates, which act on qubit q, are executed.
A valid but not necessarily optimal gate scheduling example is given in Fig. 3b. The time coordinates for all the gates are displayed at the bottom, e.g., t1 = 1 and t15 = 13. The space coordinates can be inferred from the mapping, e.g. x1 = p1, x2 = (p3, p1), x14 = (p0, p1), and x15 = p3. There is an injective map from the original gates to the scheduledgates:f(i)=ifori=1to10andf(i)=i+3fori=11to15suchthatgi =g ̃f(i) fori=1to15.The three CX gates in the dashed box g ̃11, g ̃12, and g ̃13 constitute a SWAP gate. The adjusted mapping is shown after the them. The SWAP gate is inserted so that g ̃14 and g ̃18 are on connected qubits p0 and p1, thus executable.
3.3 Formal Definition of (Depth-Optimal) Layout Synthesis Problem in Quantum Computing
Input A device graph G = (P, E) and a list of quantum gates g1…gM acting on logical qubit set Q. All the input gates are in the gate set of the device, e.g., a set from Table 1. Logical qubits are less or equal than physical qubits, i.e., |Q|  |P|.
Output An initial mapping μ0 : Q ! P, and a scheduled quantum circuit consists of a new list of gates g ̃1…g ̃M ̃ , including SWAP gates, where each gate has a spacetime coordinate (tj , xj ). We use tilde to denote that a gate is scheduled from here on.
Constraints
• Feasible two-qubit gates: all the two-qubit gates in the scheduled circuit must be on two qubits connected in
thedevicegraph.Formally,forj=1toM ̃,if|g ̃j|=2,thenxj 2E.
• Executing all gates: all input gates should be executed. Formally, there is an injective map f : {1, …, M } !
{1,…,M ̃}suchthatgi =g ̃f(i) fori=1toM.
• Respectingdependencies:fori,i0 =1toM,ifiCS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com