A Quantum Approximate Optimization Algorithm
Edward Farhi and
Center for Theoretical Physics Massachusetts Institute of Technology Cambridge, MA 02139
Copyright By PowCoder代写 加微信 powcoder
We introduce a quantum algorithm that produces approximate solutions for combinatorial op- timization problems. The algorithm depends on an integer p ≥ 1 and the quality of the approx- imation improves as p is increased. The quantum circuit that implements the algorithm consists of unitary gates whose locality is at most the locality of the objective function whose optimum is sought. The depth of the circuit grows linearly with p times (at worst) the number of constraints. If p is fixed, that is, independent of the input size, the algorithm makes use of efficient classical pre- processing. If p grows with the input size a different strategy is proposed. We study the algorithm as applied to MaxCut on regular graphs and analyze its performance on 2-regular and 3-regular graphs for fixed p. For p = 1, on 3-regular graphs the quantum algorithm always finds a cut that is at least 0.6924 times the size of the optimal cut.
MIT-CTP/4610
arXiv:1411.4028v1 [quant-ph] 14 Nov 2014
I. INTRODUCTION
Combinatorial optimization problems are specified by n bits and m clauses. Each clause
is a constraint on a subset of the bits which is satisfied for certain assignments of those bits
and unsatisfied for the other assignments. The objective function, defined on n bit strings,
is the number of satisfied clauses,
where z = z1z2 . . . zn is the bit string and Cα(z) = 1 if z satisfies clause α and 0 otherwise.
Typically Cα depends on only a few of the n bits. Satisfiability asks if there is a string that satisfies every clause. MaxSat asks for a string that maximizes the objective function. Approximate optimization asks for a string z for which C(z) is close to the maximum of C. In this paper we present a general quantum algorithm for approximate optimization. We study its performance in special cases of MaxCut and also propose an alternate form of the algorithm geared toward finding a large independent set of vertices of a graph.
The quantum computer works in a 2n dimensional Hilbert space with computational basis vectors |z⟩, and we view (1) as an operator which is diagonal in the computational basis. Define a unitary operator U(C,γ) which depends on an angle γ,
All of the terms in this product commute because they are diagonal in the computational
basis and each term’s locality is the locality of the clause α. Because C has integer eigen- values we can restrict γ to lie between 0 and 2π. Define the operator B which is the sum of
all single bit σx operators,
U(C,γ) = e−iγC =
e−iγCα . (2)
Now define the β dependent product of commuting one bit operators
U(B,β) = e−iβB =
where β runs from 0 to π. The initial state |s⟩ will be the uniform superposition over
computational basis states:
|s⟩ = √2n |z⟩ . (5)
For any integer p ≥ 1 and 2p angles γ1…γp ≡ γ and β1…βp ≡ β we define the angle dependent quantum state:
|γ,β⟩ = U(B,βp)U(C,γp)···U(B,β1)U(C,γ1) |s⟩. (6) Even without taking advantage of the structure of the instance, this state can be produced
by a quantum circuit of depth at most mp + p. Let Fp be the expectation of C in this state Fp(γ,β) = ⟨γ,β|C |γ,β⟩. (7)
and let Mp be the maximum of Fp over the angles,
Mp = max Fp (γ, β). (8)
Note that the maximization at p − 1 can be viewed as a constrained maximization at p so
Mp ≥ Mp−1. (9) lim Mp = max C(z). (10)
These results suggest a way to design an algorithm. Pick a p and start with a set of angles (γ, β) that somehow make Fp as large as possible. Use the quantum computer to get the state |γ,β⟩. Measure in the computational basis to get a string z and evaluate C(z). Repeat with the same angles. Enough repetitions will produce a string z with C(z) very near or greater than Fp(γ, β). The rub is that it is not obvious in advance how to pick good angles.
If p doesn’t grow with n, one possibility is to run the quantum computer with angles (γ, β) chosen from a fine grid on the compact set [0, 2π]p × [0, π]p, moving through the grid to find the maximum of Fp. Since the partial derivatives of Fp(γ,β) in (7) are bounded by O(m2 + mn) this search will efficiently produce a string z for which C(z) is close to Mp or larger. However we show in the next section that if p does not grow with n and each bit is involved in no more than a fixed number of clauses, then there is an efficient classical calculation that determines the angles that maximize Fp. These angles are then used to run the quantum computer to produce the state |γ, β⟩ which is measured in the computational basis to get a string z. The mean of C(z) for strings obtained in this way is Mp.
Furthermore we will later show that
II. FIXED p ALGORITHM
We now explain how for fixed p we can do classical preprocessing and determine the angles γ and β that maximize Fp(γ,β). This approach will work more generally but we illustrate it for a specific problem, MaxCut for graphs with bounded degree. The input is a graph with n vertices and an edge set {⟨jk⟩} of size m. The goal is to find a string z that makes
C = C⟨jk⟩, ⟨jk⟩
C ⟨ j k ⟩ = 1 − σ jz σ kz + 1 , 2
as large as possible. Now
Fp(γ,β) = ⟨s|U†(C,γ1)···U†(B,βp)C⟨jk⟩U(B,βp)···U(C,γ1)|s⟩.
Consider the operator associated with edge ⟨jk⟩
U†(C,γ1)···U†(B,βp)C⟨jk⟩U(B,βp)···U(C,γ1). (14)
This operator only involves qubits j and k and those qubits whose distance on the graph from j or k is less than or equal to p. To see this consider p = 1 where the previous expression is
U†(C, γ1) U†(B, β1)C⟨jk⟩U(B, β1) U(C, γ1). (15) The factors in the operator U(B,β1) which do not involve qubits j or k commute through
C⟨jk⟩ and we get
U†(C,γ1)eiβ1(σjx+σkx)C⟨jk⟩e−iβ1(σjx+σkx) U(C,γ1). (16)
Any factors in the operator U(C,γ1) which do not involve qubits j or k will commute through and cancel out. So the operator in equation (16) only involves the edge ⟨jk⟩ and edges adjacent to ⟨jk⟩, and qubits on those edges. For any p we see that the operator in (14) only involves edges at most p steps away from ⟨jk⟩ and qubits on those edges.
Return to equation (13) and note that the state |s⟩ is the product of σx eigenstates
|s⟩ = |+⟩1 |+⟩2 …|+⟩n (17)
so each term in equation (13) depends only on the subgraph involving qubits j and k and those at a distance no more than p away. These subgraphs each contain a number of qubits that is independent of n (because the degree is bounded) and this allows us to evaluate Fp in terms of quantum subsystems whose sizes are independent of n.
As an illustration consider MaxCut restricted to input graphs of fixed degree 3. For p = 1, there are only these possible subgraphs for the edge ⟨jk⟩:
We will return to this case later.
For any subgraph G define the operator CG which is C restricted to G,
and the associated operator
Alsodefine
U(CG, γ) = e−iγ CG . (20) x
CG = C⟨ll′⟩, (19) ⟨ll′ ⟩εG
BG= σj (21) jεG
U(BG,β) = e−iβBG. (22)
Let the state |s, G⟩ be
Return to equation (13). Each edge ⟨j, k⟩ in the sum is associated with a subgraph g(j, k)
|s, G⟩ = |+⟩l . lεG
and makes a contribution to Fp of
⟨s,g(j,k)|U† (Cg(j,k),γp)···U†(Bg(j,k),β1)C⟨jk⟩U (Bg(j,k),β1)···U (Cg(j,k),γp)|s,g(j,k)⟩
(23) The sum in (13) is over all edges, but if two edges ⟨jk⟩ and ⟨j′k′⟩ give rise to isomorphic
subgraphs, then the corresponding functions of (γ, β) are the same. Therefore we can view 5
the sum in (13) as a sum over subgraph types. Define
fg (γ, β) = ⟨s, g(j, k)| U†(Cg(j,k), γ1) · · · U†(Bg(j,k), βp)C⟨jk⟩U(Bg(j,x)βp) · · ·
where g(j, k) is a subgraph of type g. Fp is then
Fp (γ,β) = wg fg(γ,β)
U(Cg(j,k), γ1) |s, g(j, k)⟩ , (24) (25)
where wg is the number of occurrences of the subgraph g in the original edge sum. The
functions fg do not depend on n and m. The only dependence on n and m comes through the weights wg and these are just read off the original graph. Note that the expectation in (24) only involves the qubits in subgraph type g. The maximum number of qubits that can appear in (23) comes when the subgraph is a tree. For a graph with maximum degree v, the numbers of qubits in this tree is
qtree =2(v−1)p+1 −1, (26) (v − 1) − 1
(or 2p + 2 if v = 2), which is n and m independent. For each p there are only finitely many subgraph types.
Using (24), Fp(γ,β) in (25) can be evaluated on a classical computer whose resources are not growing with n. Each fg involves operators and states in a Hilbert space whose dimension is at most 2qtree. Admittedly for large p this may be beyond current classical technology, but the resource requirements do not grow with n.
To run the quantum algorithm we first find the (γ,β) that maximize Fp. The only dependence on n and m is in the weights wg and these are easily evaluated. Given the best (γ,β) we turn to the quantum computer and produce the state |γ,β⟩ given in equation (6). We then measure in the computational basis and get a string z and evaluate C(z). Repeating gives a sample of values of C(z) between 0 and +m whose mean is Fp(γ,β). An outcome of at least Fp(γ, β) − 1 will be obtained with probability 1 − 1/m with order m log m repetitions.
III. CONCENTRATION
Still using MaxCut on regular graphs as our example, it is useful to get information about the spread of C measured in the state |γ,β⟩. If v is fixed and p is fixed (or grows
slowly with n) the distribution of C(z) is actually concentrated near its mean. To see this, calculate
⟨γ,β|C2 |γ,β⟩−⟨γ,β|C|γ,β⟩2 (27) = ⟨s|U†(C,γ1)···U†(B,βp)C⟨jk⟩ C⟨j′k′⟩ U(B,βp)···U(C,γ1)|s⟩
⟨jk⟩ ⟨j′k′⟩
−⟨s|U†(C,γ1)···U†(B,βp)C⟨jk⟩ U(B,βp)···U(C,γ1)|s⟩
·⟨s|U†(C,γ1)···U†(B,βp)C⟨j k ⟩ p 1
If the subgraphs g(j,k) and g(j′,k′) do not involve any common qubits, the summand in (28) will be 0. The subgraphs g(j,k) and g(j′,k′) will have no common qubits as long as there is no path in the instance graph from ⟨jk⟩ to ⟨j′k′⟩ of length 2p + 1 or shorter. From (26) with p replaced by 2p + 1 we see that for each ⟨jk⟩ there are at most
2(v−1)2p+2 −1 (29) (v − 1) − 1
U(B,β )···U(C,γ )|s⟩ . (28)
edges ⟨j′k′⟩ which could contribute to the sum in (28) (or 4p + 4 if v = 2) and therefore ⟨γ,β|C2|γ,β⟩−⟨γ,β|C|γ,β⟩2 2(v−1)2p+2 −1·m (30)
(v − 1) − 1
since each summand is at most 1 in norm. For v and p fixed we see that the standard
deviation of C(z) is at most of order √
values of C(z) will be within 1 of Fp(γ, β) with probability 1 − 1 . The concentration of
m. This implies that the sample mean of order m2 m
the distribution of C(z) also means that there is only a small probability that the algorithm will produce strings with C(z) much bigger than Fp(γ,β).
IV. THE RING OF DISAGREES
We now analyze the performance of the quantum algorithm for MaxCut on 2-regular graphs. Regular of degree 2 (and connected) means that the graph is a ring. The objective operator is again given by equation (11) and its maximum is n or n−1 depending on whether n is even or odd. We will analyze the algorithm for all p.
For any p (less than n/2), for each edge in the ring, the subgraph of vertices within p of the edge is a segment of 2p + 2 connected vertices with the given edge in the middle. So for
each p there is only one type of subgraph, a line segment of 2p + 2 qubits and the weight for this subgraph type is n. We numerically maximize the function given in (24) and we find that for p = 1, 2, 3, 4, 5 and 6 the maxima are 3/4, 5/6, 7/8, 9/10, 11/12, and 13/14 to 13 decimal places from which we conclude that Mp = n(2p + 1)/(2p + 2) for all p. So the quantum algorithm will find a cut of size n(2p + 1)/(2p + 2) − 1 or bigger. Since the best cut is n, we see that our quantum algorithm can produce an approximation ratio that can be made arbitrarily close to 1 by making p large enough, independent of n. For each p the circuit depth can be made 3p by breaking the edge sum in C into two sums over ⟨j, j + 1⟩ with j even and j odd. So this algorithm has a circuit depth independent of n.
V. MAXCUT ON 3-REGULAR GRAPHS
We now look at how the Quantum Approximate Optimization Algorithm, the QAOA, performs on MaxCut on (connected) 3-regular graphs. The approximation ratio is C(z), where z is the output of the quantum algorithm, divided by the maximum of C. We first show that for p = 1, the worst case approximation ratio that the quantum algorithm produces is 0.6924.
Suppose a 3-regular graph with n vertices (and accordingly 3n/2 edges) contains T “iso- lated triangles” and S “crossed squares”, which are subgraphs of the form,
The dotted lines indicate edges that leave the isolated triangle and the crossed square. To say that the triangle is isolated is to say that the 3 edges that leave the triangle end on distinct vertices. If the two edges that leave the crossed square are in fact the same edge, then we have a 4 vertex disconnected 3-regular graph. For this special case (the only case where the analysis below does not apply) the approximation ratio is actually higher than 0.6924. In general, 3T + 4S ≤ n because no isolated triangle and crossed square can share a vertex.
Return to the edge sum in F1(γ,β) of equation (13). For each crossed square there is 8
one edge ⟨jk⟩ for which g(j, k) is the first type displayed in (18). Call this subgraph type g4 because it has 4 vertices. In each crossed square there are 4 edges that give rise to subgraphs of the second type displayed in (18). We call this subgraph type g5 because it has 5 vertices. All 3 of the edges in any isolated triangle have subgraph type g5, so there are 4S + 3T edges with subgraph type g5. The remaining edges in the graph all have a subgraph type like the third one displayed in (18) and we call this subgraph type g6. There are (3n/2 − 5S − 3T ) of these so we have
F1 (γ , β ) = S fg4 (γ , β ) + (4S + 3T ) fg5 (γ , β ) + 3n − 5S − 3T fg6 (γ , β ) (32) 2
The maximum of F1 is a function of n, S, and T,
M1(n, S, T ) = max F1(γ, β). (33)
Given any 3 regular graph it is straightforward to count S and T. Then using a classical computer it is straightforward to calculate M1 (n, S, T ). Running a quantum computer with the maximizing angles γ and β will produce the state |γ,β⟩ which is then measured in the computational basis. With order n log n repetitions a string will be found whose cut value is very near or larger than M1(n,S,T).
To get the approximation ratio we need to know the best cut that can be obtained for the input graph. This is not just a function of S and T . However a graph with S crossed squares and T isolated triangles must have at least one unsatisfied edge per crossed square and one unsatisfied edge per isolated triangle so the number of satisfied edges is ≤ (3n/2 − S − T ). This means that for any graph characterized by n, S and T the quantum algorithm will produce an approximation ratio that is at least
M1(n,S,T). (34) 3n − S − T
It is convenient to scale out n from the top and bottom of (34). Note that M1/n which
comes from F1/n depends only on S/n ≡ s and T/n ≡ t. So we can write (34) as M1(1, s, t)
3 −s−t (35) 2
where s, t ≥ 0 and 4s + 3t ≤ 1. It is straightforward to numerically evaluate (35) and we find that it achieves its minimum value at s = t = 0 and the value is 0.6924. So we know that on any 3-regular graph, the QAOA will always produce a cut whose size is at least 0.6924
times the size of the optimal cut. This p = 1 result on 3-regular graphs is not as good as known classical algorithms [1].
It is possible to analyze the performance of the QAOA for p = 2 on 3-regular graphs. However it is more complicated then the p = 1 case and we will just show partial results. The subgraph type with the most qubits is this tree with 14 vertices:
Numerically maximizing (24) with g given by (36) yields 0.7559. Consider a 3-regular graph on n vertices with o(n) pentagons, squares and triangles. Then all but o(n) edges have (36) as their subgraph type. The QAOA at p = 2 cannot detect whether the graph is bipartite, that is, completely satisfiable, or contains many odd loops of length 7 or longer. If the graph is bipartite the approximation ratio is 0.7559 in the limit of large n. If the graph contains many odd loops (length 7 or more), the approximation ratio will be higher.
VI. RELATION TO THE QUANTUM ADIABATIC ALGORITHM
We are focused on finding a good approximate solution to an optimization problem whereas the Quantum Adiabatic Algorithm, QAA [2], is designed to find the optimal solu- tion and will do so if the run time is long enough. Consider the time dependent Hamiltonian H (t) = (1 − t/T )B + (t/T )C . Note that the state |s⟩ is the highest energy eigenstate of B and we are seeking a high energy eigenstate of C. Starting in |s⟩ we could run the quan- tum adiabatic algorithm and if the run time T were long enough we would find the highest energy eigenstate of C. Because B has only non-negative off-diagonal elements, the Perron- Frobenius theorem implies that the difference in energies between the top state and the one below is greater than 0 for all t < T, so for sufficiently large T success is assured. A Trot- terized approximation to the evolution consists of an alternation of the operators U(C,γ) and U(B,β) where the sum of the angles is the total run time. For a good approximation we want each γ and β to be small and for success we want a long run time so together these
force p to be large. In other words, we can always find a p and a set of angles γ,β that make Fp(γ,β) as close to Mp as desired. With (9), this proves the assertion of (10).
The previous discussion shows that we can get a good approximate solution to an opti- mization problem by making p sufficiently large, perhaps exponentially large in n. But the QAA works by producing a state with a large overlap with the optimal string. In this sense (10), although correct, may be misleading. In fact on the ring of disagrees the state pro- duced at p = 1, which gives a 3/4 approximation ratio, has an exponentially small overlap with the optimal strings.
We also know an example where the QAA fails and the QAOA succeeds. In this example (actually a minimization) the objective function is symmetric in the n bits and therefore depends only on the Hamming weight. The objective function is plotted in figure 1 of reference [3]. Since the beginning Hamiltonian is also symmetric the evolution takes place in a subspace of dimension n + 1 with a basis of states |w⟩ indexed by the Hamming weight. The example can be simulated and analyzed for large n. For subexponential run times, the QAA is trapped in a false minimum at w = n. The QAOA can be similarly simulated and analyzed. For large n, even with p = 1, there are values of γ1 and β1 such that the final state is concentrated near the true minimum at w = 0.
The Quantum Approximate Optimization Algorithm has the key feature that as p in- creases the approximation improves. We contrast this to the performance of the QAA. For realizations of the QAA there is a total run time T that also appears in the instantaneous Hamiltonian, H(t) = H(t/T). We start in the ground state of H(0) seeking the ground state of H(1). As T goes to infinity the overlap of the evolved state with the desired state goes to 1. However the success probability is generally not a monotonic function of T. See figure 2 of reference [4] for an extreme example where the success probability is plotted as a function of T for a particular 20 qubit instance of Max2Sat. The probability rises and then drops dramatically, and the ultimate rise for large T is not seen for times that can be reasonably simulated. It may well be advantageous in designing strategies for the QAOA to use the fact that the approximation improves as p increases.
VII. A VARIANT OF THE ALGORITHM
We are now going to give a variant of the basic algorithm which is suited to situations where th
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com