Prof. Mark Bun
CAS CS 591 B: Communication Complexity
Lecture Notes 13:
Unbounded error communication
Fall 2019
Reading.
• Paturi-Simon, Probabilistic Communication Complexity
We’ll begin studying a very strong model of probabilistic communication complexity called the unbounded error model. In this model we are interested in computing a function with any advantage over random guessing. It will be convenient for us to change the range of our Boolean functions to {−1, 1} instead of {0, 1}.
Denition 1. The unbounded error communication complexity of a function f : X × Y → {−1, 1}, denoted UPPcc(f), is the minimum length of a private coin protocol Π such that for every (x,y)
Pr[Π(x, y) = f(x, y)] > 1/2 ⇐⇒ sgn E[Π(x, y)] = f(x, y).
This denition should be compared with that of the PPcc model which we dened earlier when discussing discrepancy. Recall that the PPcc cost of a protocol charges for both its length and for log(1/δ), where δ is its advantage over random guessing on a worst-case input. One way to think about this additional charge of log(1/δ) is as a proxy for charging for the amount of randomness used in the protocol. Moreover, by Newman’s theorem, the denition of PPcc changes by only additive log terms depending on whether we allow public randomness. The UPPcc model, on the other hand, is only interesting in the absence of public randomness. If we were to permit the use of public randomness, then every function would have an O(1) cost protocol. Alice could just send a bit indicating whether her string x is equal to the rst n bits of the public random string. If so, then Bob can compute f(x,y), and otherwise, he can output a random guess. This protocol has advantage 2−n.
Even if we only allow for private coins, many functions have ecient UPPcc protocols. Example 2. UPPcc(EQn), UPPcc(GTn) = O(1) and UPPcc(GHn), UPPcc(DISJn) = O(log n).
We’ve seen protocols for the latter two before. The easiest way to see that there are constant communication protocols for Equality and Greater-Than is by studying equivalent characterizations of UPPcc.
1 Sign Rank
UPPcc has an extremely clean matrix analytic characterization. Recall the log rank conjecture, which asserts that deterministic communication complexity of f is characterized up to polynomial factors by log rank Mf . The analog of this conjecture for UPPcc complexity is true in the the sharpest possible way.
1
Denition 3. The sign rank of a matrix M with entries in {−1,1}, denoted rank±M, is the minimum rank of a real-valued matrix R such that sgn R[x, y] = M [x, y] for every (x, y).
Theorem 4. For a function f : X × Y → {−1, 1}, let Sf ∈ R|X|×|Y | be its sign matrix Sf [x, y] = f(x,y). Then
UPPcc(f) = log rank±(Sf ) ± O(1).
Proof. We rst show that log rank±(Sf ) ≤ UPPcc(f) + O(1). Suppose Π is a UPPcc protocol computing f with communication cost c. Dene the matrix R[x, y] = E[Π(x, y)]. Then sgn R[x, y] = Sf [x, y], so we just need to show that rank R ≤ 2c. Write
E[Π(x, y)] = Pr[Π(x, y) reaches l] − Pr[Π(x, y) reaches l], l:Π outputs 1 at leaf l l:Π outputs −1 at leaf l
and for each such leaf l dene the matrix Rl[x, y] = Pr[Π(x, y) reaches l] = pl(x) · ql(y) for some functions pl,ql, using the fact that Π is a private coin protocol. This implies that every Rl is a rank-1 matrix, and hence
R= Rl− Rl l:Π outputs 1 at leaf l l:Π outputs −1 at leaf l
has rank at most 2c.
For the other direction, we want to show that if rank±(Sf ) = r, then we can construct a UPPcc
protocol for f with cost log r + O(1). The condition rank±(Sf ) = r is equivalent to the existence ofvectors{ux ∈Rr :x∈X}and{vy ∈Rr :y∈Y}suchthatsgn⟨ux,vy⟩=f(x,y)forevery x,y. Assume without loss of generality that the vectors are normalized so that every ∥ux∥1 = 1 and ∥vy∥1 = 1. Hence each |(ux)1|,…,|(ux)r| represents a probability distribution over [r] (and similarly for vy ). This suggests the following cost log r + O(1) protocol for computing f :
1. Alice samples i ∈ [r] with probability |(ux)i|. She sends i together with a = sgn(ux)i to Bob. 2. Bob samples b ∈ {−1, 1} such that E[b] = (vy )i and outputs ab.
We compute the advantage of this protocol as follows:
whose sign agrees with f(x,y).
r
E[Π(x, y)] = |(ux)i|E[ab|i]
i=1 r
= (ux)iE[b|i] i=1
r
= (ux)i(vy)i
i=1
= ⟨ux,vy⟩
2
2 Arrangements of Hyperplanes
A hyperplane through the origin in Rd is specied by its normal vector v ∈ Rd and consists of the points u ∈ Rd such that ⟨u, v⟩ = 0. It divides Rd into two halfspaces: a positive side consisting of points u with ⟨u, v⟩ > 0 and a negative side where ⟨u, v⟩ < 0.
A collection of hyperplanes V = {vy ∈ Rd : y ∈ Y } realizes a matrix {−1, 1} ∈ R|X|×|Y | if for every row vector mx ∈ {−1, 1}|Y | of M there exists a point ux ∈ Rd such that
(mx)y = sgn⟨ux, vy⟩
for every y ∈ Y . One way to think about what's going on is that a set of hyperplanes divides Rd into (at most) 2|Y | regions. To each region, we can assign a |Y |-bit string indicating which side of each hyperplane that region lies in. The collection of hyperplanes realizes M if every row of M appears as one of these strings.
The sign rank of a matrix M is the minimum dimension d over which there is a collection of hyperplanes realizing M. In one direction, suppose M has sign rank d. Then there exist vectors {ux ∈Rr :x∈X}and{vy ∈Rr :y∈Y}suchthatsgn⟨ux,vy⟩=M[x,y]. Sowecantakethe ux's to be the points and the vy's to the the hyperplanes. Conversely, a collection of hyperplanes realizing M induces the sets of of vectors required to show that M has sign rank d.
This geometric interpretation of sign rank is useful for several reasons. One is that it gives rise to machine learning algorithms; in fact, the fastest known algorithm for PAC learning DNF is based on a sign rank upper bound. To see the connection, suppose we have a collection of functions {fy : X → {−1,1}} indexed by y ∈ Y. Let (x1,fy∗(x1)),...,(xk,fy∗(xk)) be a sequence of labeled samples from X. We'd like to be able to learn (an approximation to) the function fy∗. If the matrix M[x,y] = fy(x) has sign rank r, then the problem of nding fy∗ reduces to the problem of nding an r-dimensional hyperplane v which separates the points uxi for which fy∗ (xi) < 0 from the points uxi for which fy∗ (xi) > 0. This can be done in poly(r) time using linear programming.
The other reason is that it lets us design the very ecient communication protocols mentioned at the beginning of the lecture. For instance, to design an arrangement of hyperplanes realizing the communication matrix of EQn, we can take the points ux to be evenly spaced along the unit circle in R2 and the hyperplanes to be just below the tangents to the circle at these points. The dimension of this arrangement is 2, hence the sign rank of the communication matrix is 2 and there is an O(1)-communication protocol for this problem.
Next time, we’ll talk about how to prove strong lower bounds on sign rank and UPPcc com- munication.
3