CS代考计算机代写 data structure scheme algorithm chain AI decision tree PROPERTY TESTING LOWER BOUNDS VIA COMMUNICATION COMPLEXITY

PROPERTY TESTING LOWER BOUNDS VIA COMMUNICATION COMPLEXITY
Eric Blais, Joshua Brody, and Kevin Matulef
February 21, 2012
Abstract. We develop a new technique for proving lower bounds in property testing, by showing a strong connection between testing and communication complexity. We give a simple scheme for reducing com- munication problems to testing problems, thus allowing us to use known lower bounds in communication complexity to prove lower bounds in testing. This scheme is general and implies a number of new testing bounds, as well as simpler proofs of several known bounds.
For the problem of testing whether a boolean function is k-linear (a par- ity function on k variables), we achieve a lower bound of Ω(k) queries, even for adaptive algorithms with two-sided error, thus confirming a conjecture of Goldreich (2010a). The same argument behind this lower bound also implies a new proof of known lower bounds for testing re- lated classes such as k-juntas. For some classes, such as the class of monotone functions and the class of s-sparse GF(2) polynomials, we significantly strengthen the best known bounds.
Keywords. Property testing, Communication complexity, Lower bounds.
Subject classification. 68Q17
1. Introduction
The field of property testing seeks to formalize the question: what can we determine about a large object, with limited access to the object itself? In general the large object may be anything—for

2 Blais, Brody & Matulef
instance a graph on n nodes, or a function on n variables. In a typical property testing setup, a tester who has unbounded com- putational power is given query access to the large object. The tester’s goal is to accept the object if it has some property P, and reject it if it is “far” from having property P.
In this paper we will primarily concern ourselves with the case when the large object is a boolean function f on n bits. In this case, the tester’s goal is to accept f with probability at least 2/3 if f has property P, and reject with probability at least 2/3 if f must be modified on an ε fraction of the 2n possible inputs in order to have property P. The query complexity (i.e. the number of times the testing algorithm must query f) should hopefully be a small function of ε and n.
The notion of testing boolean functions in this framework goes back to the seminal work of Rubinfeld & Sudan (1996) and has several connections to complexity theory (in particular PCPs and hardness of approximation), as well as computational learning the- ory (Ron 2008). Over the last two decades, researchers have ex- erted a considerable amount of effort to determine the query com- plexity for testing properties of a function f, such as whether f is a linear function (Blum et al. 1993), whether f is isomorphic to a given function (Alon & Blais 2010; Blais & O’Donnell 2010; Chakraborty et al. 2011b), whether f is a k-junta (Blais 2008, 2009; Fischer et al. 2004), a monotone function (Fischer et al. 2002; Goldreich et al. 2000), a dictator (Parnas et al. 2002), a halfspace (Matulef et al. 2009), an s-sparse polynomial, a size-s decision tree, etc. (Diakonikolas et al. 2007). Starting with the ground-breaking work of Goldreich et al. (1998), there has also been much effort directed at determining the query complexity for testing proper- ties of graphs and, more generally, of combinatorial objects. (See, e.g., Goldreich 2010b; Ron 2009.)
Over the course of this effort, a variety of techniques have been developed for designing property testing algorithms, thus proving testing upper bounds. However, as is often the case in theoretical computer science, lower bounds are harder to come by. Although several lower bounds for specific problems are known, few general techniques are known beyond the use of Yao’s minimax lemma.

Property Testing Lower Bounds 3
Communication complexity is an area which has collected many effective techniques for proving lower bounds in other areas of com- puter science. In a typical setup, two parties, Alice and Bob, each have an input and they would like to decide something about their joint input. Their computational power is unbounded, but they would like to compute the answer with as little communication as possible.
The communication complexity framework has been extensively studied; in particular, several problems are known to require a large amount of communication. These include set-disjointness, in- dex, inner-product, and gap-hamming-distance. The hard- ness of these and related problems has been used to obtain lower bounds in many areas such as streaming algorithms, circuit com- plexity, data structures, and proof complexity (Indyk & Woodruff 2003; Kushilevitz & Nisan 1997; Miltersen et al. 1995).
Property testing and communication complexity have striking similarities. Both involve parties with unbounded computational power (in one case, the tester, and in the other case, the commu- nicating players), and both involve algorithms which are restricted by the parties’ limited access to their input. Despite these similar- ities, no previous connection between these fields has been made.
In this work we show that there is indeed a strong connection between testing and communication complexity. More specifically, we show how to reduce certain communication problems to prop- erty testing problems. This reduction method represents a new approach to proving testing lower bounds. This approach turns out to be quite fruitful, both for proving new bounds and for giv- ing simpler proofs of known bounds in property testing.
1.1. Our Results. A detailed description of our results follows below.
Testing k-linear functions. The function f : {0, 1}n → {0, 1} is linear, i.e. a parity function, when there exists a set S = {i1,…,is} ⊆ [n] such that for every x ∈ {0,1}n, f(x) = xi1 ⊕···⊕ xis . When |S| = k, we say that f is a k-linear function. The prob- lem of testing k-linear functions was first studied by Fischer et al.

4 Blais, Brody & Matulef
(2004). The best lower bound is due to Goldreich (2010a), who
k) queries are required to test k-linear functions. He also showed that non-adaptive testers require Ω(k) queries to test the same property, and conjectured that this stronger lower
bound holds for all testers (adaptive or not).1
We confirm Goldreich’s conjecture. Using the same construc-
tion, we also obtain lower bounds on the query complexity for testing juntas,2 testing functions of low Fourier degree, and testing sparse polynomials.
Theorem 1.1. Fix 1 < k < n−1. At least Ω(min{k,n−k}) queries are required to test (i) k-linear functions, (ii) k-juntas, (iii) functions of Fourier degree at most k, and (iv) functions with k-sparse polynomial representation in GF(2). Remark 1.2. In parallel work, Blais & Kane (2011) simultane- ously obtained a different proof of Goldreich’s conjecture using Fourier-analytic methods. Theorem 1.1 has implications for the problem of isomorphism testing, or testing whether an unknown function f is equivalent, up to permutation of variables, to a fixed function g : {0,1}n → {0,1}. Alon & Blais (2010) showed that for most functions g, testing g-isomorphism non-adaptively requires Ω(n) queries. Sim- ilarly, Chakraborty et al. (2011b) showed that for every k ≤ n, 1The conjecture and results of Goldreich (2010a) are stated in terms of testing ≤k-linear functions (the class of functions that are parities on at most k bits) but the proofs of Goldreich (2010a) give identical lower bounds for testing k-linearity. Similarly, the technique in our lower bounds for testing k-linearity gives identical bounds for testing ≤k-linearity. 2Informally, a function is a k-junta if it has at most k relevant variables. For a complete definition of the properties stated in Theorem 1.1, see Section 3. Similarly, the properties introduced in the rest of the introduction are defined formally in the sections containing the proofs of the corresponding theorems. showed that Ω( √ Property Testing Lower Bounds 5 there exists a k-junta g such that testing g-isomorphism requires Ω(k) queries. Both of these results are non-constructive, and they raise the question of whether we can identify an explicit func- tion for which the same lower bounds apply. Theorem 1.1 shows that for every k ≤ n, the lower bound applies to the function 2 g : x 􏰶→ x1 ⊕ · · · ⊕ xk since testing isomorphism to this function is equivalent to testing k-linearity. Testing and OBDDs. Fix some finite sets X and Y . An ordered binary decision diagram (OBDD) is a directed acyclic graph with a single root and at most n + 1 levels of nodes. The sink nodes in the last level are each associated with some element from Y . All other levels are associated with an index from {1, . . . , n}. The nodes in these levels have out-degree |X|, with one edge associated with each element of X. For x = (x1,...,xn) ∈ Xn, define f(x) to be the value of the sink node that we reach when we start at the source, then at each level we follow the edge associated with xi, where i is the index associated with the current level. The resulting function f : Xn → Y is the function computed by the OBDD. The width of an OBDD is the maximum number of nodes at any level. Ron & Tsur (2009) first studied the problems of determining the query complexity for testing whether a function is computable by small-width OBDDs. Goldreich (2010a) continued this line of research and also asked whether we can establish stronger lower bounds for the query complexity of testing properties that include a subset of the functions computable by small-width OBDDs. In particular, he showed that there is a property consisting of func- tions computable by width-2 OBDDs that requires Θ(n) queries to test. Theorem 1.1 gives a different proof of the same result, since n -linear functions are computable by width-2 OBDDs. 2 Goldreich (2010a) conjectured in that an identical lower bound also held for testing an explicit subclass of functions computable by width-3 OBDDs and for testing functions computable by width-4 OBDDs. Recently, he observed (private communication) that the method of the proof of Theorem 1.1 can also be used to prove these conjectures: 6 Blais, Brody & Matulef Theorem 1.3. Testing the class of linear functions from GF(3)n to GF(3) that have only 0-1 coefficients requires Θ(n) queries. Note that this class of functions can be trivially computed by a width-3 OBDD by maintaining a partial sum of the inputs. Theorem 1.4. Testing the class of functions that are computable by width-4 OBDDs requires Θ(n) queries. Remark 1.5. Theorem 1.4 was proved independently by Brody et al. (2011). Testing monotonicity. Fix R ⊆ R. The function f : {0, 1}n → R is monotone if for any two inputs x,y ∈ {0,1}n where x1 ≤ y1,...,xn ≤ yn, we have that f(x) ≤ f(y). The problem of test- ing monotonicity was first studied by Goldreich et al. (2000), who introduced a natural tester: sample random edges from the hyper- cube and verify that the function is monotone on those edges. This algorithm makes O(n log |R|) queries (Dodis et al. 1999). An im- portant open problem in property testing is to determine whether there exist more efficient monotonicity testers. Despite much attention to monotonicity testing (Batu et al. 1999; Bhattacharyya et al. 2009; Bri ̈et et al. 2010; Dodis et al. 1999; Ergun et al. 2000; Fischer et al. 2002; Goldreich et al. 2000), lower bounds for the query complexity of this problem have been elusive. Previously, the best bound for non-adaptive testers, due to Fischer et al. (2002), was only Ω(log n). This translates to a Ω(log log n) lower bound for general (adaptive) testers.3 We pro- vide a significant improvement to this lower bound for functions with large ranges: Theorem 1.6. Testing f : {0, 1}n → R for monotonicity requires Ω(min{n, |R|2}) queries. Notably, Theorem 1.6 gives the first progress on the natural- monotonicity-tester problem mentioned above: it shows that for 3Stronger bounds have been established for testers with one-sided error. See (Bri ̈et et al. 2010; Fischer et al. 2002) for details. Property Testing Lower Bounds 7 √n ≤ |R| ≤ poly(n), no monotonicity tester can improve on the query complexity of the natural tester by more than a logarithmic factor. We note, however, that this problem is still open in the important special case when R = {0, 1}. By a recent result of Seshadhri & Vondr ́ak (2011), Theorem 1.6 also gives a new lower bound for the query complexity of testing submodularity. Corollary 1.7. Testing f : {0,1}n → R for submodularity re- quires Ω(n) queries. Testing concise representations. Parnas, Ron & Samorodnit- sky (2002) showed that testing whether a function can be repre- sented by a monotone DNF with at most s terms can be done with a number of queries that depends only on s. This result was gen- eralized by Diakonikolas et al. (2007), who introduced the method of testing by implicit learning and showed that this method can be used to test whether a function can be represented by a DNF with few terms, by a small decision tree, by a small branching program, etc. Our technique gives lower bounds on the query complexity for testing two of these properties. Theorem 1.8. At least Ω(min{s, n − s}) queries are required to test (i) size-2s decision trees, and (ii) size-s branching programs. Testing juntas. The proof of Theorem 1.8 can also be extended to answer a question of Fischer et al. (2004): they asked if the query complexity of testing k-juntas can be reduced if the tester is only required to reject functions that are far from (k+t−2)-juntas for some t > 0. We show that the answer to this question is “no” √
foranyt≤O( k).
Theorem 1.9. Fix k ≤ n and t > 0. Any algorithm that accepts
2
k-juntas and rejects functions 1 -far from (k + t − 2)-juntas with
􏰃4 k2 􏰄
high probability must make Ω min{( t ) , k} queries.

8 Blais, Brody & Matulef
Remark 1.10. Subsequently, the result of Theorem 1.9 was es- sentially strengthened by Ron & Tsur (2011), who showed that for any constants ε and γ, a nearly linear lower bound of Ω(k/logk) queries holds even if we are only required to accept k-juntas and reject functions that are ε-far from (1 + γ)k-juntas.
Testers with one-sided error. The technique we introduce for proving new lower bounds can also be used to prove lower bounds for testers with one-sided error (that is, testers which accept func- tions with probability 1 if they have property P, and reject them with probability at least 2/3 if they are far from having property P). As a first application, we get a much stronger lower bound for the query complexity of testing decision trees with one-sided error.
Theorem 1.11. At least Ω(s) queries are required to test size-s decision trees with one-sided error.
We also obtain a lower bound on the query complexity of one- sided testers for a subclass of halfspaces, the class of “signed” ma- jority functions on k variables.
Theorem 1.12. Fix any constant γ ∈ (0, 1). For k ≤ γn, at least Ω(k/logk) queries are required to test signed k-majorities with one-sided error.
1.2. Techniques. The main idea behind all of our lower bounds is to show that the query complexity for testing a property P is bounded below by the randomized communication complexity of some well-studied communication game G. To do so, we introduce a P-testing game. The definition of this game varies according to the situation, but typically looks like this: Alice receives a boolean function f, Bob receives a boolean function g, and they must test whether the joint function h = f ⊕ g has the property P. We can then relate the number of queries required to test whether h has this property to the number of bits Alice and Bob need to communicate. Finally, we show that the P-testing game requires large communication by using it to solve G.

Property Testing Lower Bounds 9
This technique is best illustrated by example. In fact, we can give a very simple sketch of the proof of Theorem 1.1 (i), by show- ing how to reduce a version of the well-known set-disjointness problem to testing k-linearity. Suppose Alice and Bob each have sets of size k from a universe of size n. Suppose further that their sets are guaranteed to either intersect in exactly one place, or not at all, and they want to decide which is the case. We let k-disj de- note this particular setting for set-disjointness. It is well-known that the communication complexity of k-disj is Ω(k) (H ̊astad & Wigderson 2007; Kalyanasundaram & Schnitger 1992).
One way Alice and Bob can solve k-disj is by forming lin-
ear functions based on their two sets. For a set S ⊆ [n], define
ParityS to be the linear function on the bits indexed from S; i.e.,
ParityS(x) := 􏰁i∈S xi. Given input sets A and B, Alice forms
the function f = ParityA and Bob forms the function g = ParityB.
The joint function h = f ⊕g is 2k-linear if the sets do not intersect,
and (2k − 2)-linear if they do. Note that every 2k-linear function
is 1 -far from being (2k − 2)-linear (see Proposition 3.4). Therefore, 2
they can determine if their sets intersect by emulating a testing al- gorithm for (2k − 2)-linearity on h. The testing algorithm requires oracle access to h, which neither Alice nor Bob have. However, they do know f and g, so Alice and Bob can simulate oracle access to h by exchanging f(x) and g(x), at a cost of two bits of com- munication per query. The total number of bits communicated is then twice the number of queries of the tester. Since we can lower bound the number of bits communicated by Ω(k), this implies that testing (2k − 2)-linearity also requires Ω(k) queries. By scaling k, we achieve the first part of Theorem 1.1.
To summarize, our lower bound for testing k-linearity follows from three inequalities. Letting Ck-linear denote the communica-
tion game where Alice and Bob get f and g as input and wish to determine if f ⊕ g is k-linear or far from k-linear, and using Q(P) to denote the query complexity of testing for P and R(G) to denote the randomized communication complexity of a commu- nication game G, we achieve a lower bound on testing k-linearity via the following chain of inequalities:
(1.13) 2Q(k-linear) ≥ R(Ck-linear) ≥ R(k-disj) = Ω(k) . ⊕

10 Blais, Brody & Matulef
All of the testing lower bounds in this paper follow the above struc- ture. A crucial aspect of this proof technique is that emulating the property testing algorithm must be done in a communication ef- ficient manner. In the example above, the joint function h was just the xor of Alice’s and Bob’s functions, so simulating each query required only two bits of communication. For other lower bounds, we require more complicated ways to build a joint func- tions. However, as long as each h(x) can be simulated with low communication, a similar lower bound will hold. We formalize this statement in Lemma 2.4.
Note that in most situations, it is possible to use problems such as k-disj whose communication complexity is well understood, and therefore we get the equality in (1.13) essentially for free. The re- duction in the first inequality is captured by Lemma 2.4, so for most proofs, the bulk of the actual work is in proving a so-called “dis- tance lemma”—that yes instances for the communication problem map to instances where the combined function has property P, and that no instances will map to functions that are far from having P. In most cases, as in Proposition 3.4, these are simple to prove.
1.3. Organization. In Section 2, we introduce the communi- cation complexity and property testing definitions, the Main Re- duction Lemma (Lemma 2.4), and the communication complexity lower bounds that we use in the later sections.
The proofs of Theorem 1.1–1.4, as well as the formal definitions of the properties defined in those theorems, are presented in Sec- tion 3. The lower bound in Theorem 1.6 for testing monotonicity and the lower bound in Corollary 1.7 for testing submodularity are presented in Section 4. We complete the proof of Theorem 1.8 re- garding the query complexity for testing functions computable by small decision trees or by small branching programs in Section 5. In the same section, we also complete the proof of Theorem 1.9. Finally, we present the lower bounds on the query complexity of one-sided testers for decision trees (Theorem 1.11) and for unsigned k-majority functions (Theorem 1.12) in Section 6.

Property Testing Lower Bounds 11 2. From Communication Complexity to
Property Testing
2.1. Property Testing Definitions. Recall that for a fixed range R ⊆ R, a property of the functions {0, 1}n → R is a sub- set of those functions. The distance between two functions f,g : {0,1}n → R is the fraction of inputs x ∈ {0,1}n for which f(x) ̸= g(x). The distance between f and a property P is the minimum distance between f and any function g in P. When the distance fromf toP isatleastε,wesaythatf isε-far fromP.
Definition 2.1 (Tester). An (ε,q)-tester for the property P of functions {0,1}n → R is a randomized algorithm that queries a function f : {0, 1}n → R on at most q inputs from {0, 1}n and
(i) Accepts with probability at least 2 when f is in P; and 3
(ii) Rejects with probability at least 2 when f is ε-far from P. 3
A tester is said to be non-adaptive if it selects its q queries
before observing the value of f on any of those queries; otherwise
it is adaptive. A tester that always accepts functions in P has one-
sided error; a tester that accepts functions in P with probability
p for some 2 ≤ p < 1 has two-sided error. 3 For any 0 < ε < 1, the query complexity of the property P at distance ε, denoted Qε(P), is the minimum value of q for which P has an (ε,q)-tester. Similarly, Q1(P) and Qna(P) denote the εε minimum number of queries required to ε-test P with one-sided error and non-adaptive testers, respectively. Throughout this work, we will assume that ε is a small fixed constant—say, ε = 0.01 for concreteness—and for simplicity we state all query complexity bounds only in terms of the other parameters and will omit ε from the notation. 2.2. Communication Complexity Definitions. In this sub- section, we review the basic communication complexity setup and highlight some of the terms and concepts particularly relevant to this article. For more details, we refer the interested reader to the standard textbook by Kushilevitz & Nisan (1997). 12 Blais, Brody & Matulef In a typical communication game, there are two parties—Alice, who receives an input x, and Bob, who receives some input y. Al- ice and Bob wish to jointly compute some function f(x,y) of their inputs. Neither player sees all the information needed to compute f, so they must communicate together to solve the problem. Com- munication complexity is the study of how much communication is necessary to compute f, for various functions f. A protocol is a distributed algorithm that Alice and Bob use to compute f(x,y); in particular, it specifies what messages Alice and Bob send to each other. In a deterministic protocol, Alice’s messages are a function only of her input x and the previous com- munication in the protocol. Similarly, Bob’s messages are a func- tion of y and the previous communication. The cost of a protocol is the maximum (over all inputs) number of bits sent by Alice and Bob. The deterministic communication complexity of f, denoted D(f), is the minimum cost of a deterministic protocol computing f. In a randomized protocol, Alice and Bob have shared access to a (public coin) random string r ∈ {0,1}∗. We say that P is a δ-error protocol for f if for any input pair x, y, P computes f (x, y) with probability at least 1 − δ, where the probability is taken over the random string r. We use Rδ(f) to denote the minimum cost of a δ-error protocol for f and define R(f) := R1/3(f). When f is a binary function, we say that a protocol computes f with one-sided error if there exists z ∈ {0, 1} such that P computes f with certainty whenever f (x, y) ̸= z, and with probability at least 1 − δ when f (x, y) = z. When considering randomized protocols with one-sided error, it is important to note which “side” the error guarantee is on. We use Rδz(f) to denote the minimum cost of a randomized protocol for f that correctly computes f whenever f (x, y) ̸= z and computes f with probability at least 1−δ whenever f(x,y) = z. We define Rz(f) := Rz (f). 1/3 A protocol is one-way if the communication consists of a sin- gle message from Alice to Bob, who then outputs an answer. We use Rδ→(f) to denote the minimum communication cost of a ran- domized, δ-error, one-way protocol for f. Finally, we use R→,z(f) δ to denote the minimum communication cost of randomized one- Property Testing Lower Bounds 13 way protocols for f with with one-sided error δ, and we define R→,z(f) := R→,z(f). 1/3 2.3. The Main Reduction. Below we define a class of property testing communication games and show how communication lower bounds for these games yield query complexity lower bounds for property testers. Our communication games are based on what we call combining operators. Definition 2.2 (Combining operator). A combining operator is an operator ψ that takes as input two functions f, g : {0, 1}n → Z and returns a function h : {0, 1}n → R. We refer to the inputs f and g as the base functions of ψ. By con- vention, we use h to refer to the output of ψ. Given a combining operator ψ and a property P, we define CPψ to be the following property testing communication game. Alice receives f. Bob re- ceives a function g. They need to compute 􏰀1 ifψ(f,g)∈P CPψ(f,g) := 0 if ψ(f,g) is ε-far from P. We prove all of our testing lower bounds by reducing from an associated communication game CPψ . As mentioned in Section 1.2, thisreductionissimple—AliceandBobsolveCPψ byemulatingaP- testing algorithm on h := ψ(f, g). Note that neither Alice nor Bob have enough information to evaluate a query h(x), because h de- pends on both f and g. Instead, they must communicate to jointly compute h(x). For this reduction to give a strong query complexity lower bound for the property testing problem, it is essential that the joint computation of h(x) occurs in a communication-efficient manner. The following definition gives a sufficient condition on combin- ing operators that yield strong reductions to testing problems. Definition 2.3 (Simple combining operator). A combining oper- ator ψ is simple if for all f, g, and for all x, the query h(x) can be computed given only x and the queries f(x) and g(x). 14 Blais, Brody & Matulef For example, when the base functions are boolean, the combin- ing operator defined by ψ(f, g) := f ⊕ g is clearly simple—each h(x) = f (x) ⊕ g(x) can trivially be computed from f (x) and g(x). On the other hand, the combining operator ψ that returns the function defined by h(x) := 􏰁y∈T [f (y) · g(y)] is not simple when T is a large set of strings (say a Hamming ball centered at x), since computing h(x) requires knowledge of f(y) and g(y) for several y. All of the property testing communication games we use in this paper are based on simple combining operators and give us a tight connection between property testing and communication complexity via the following lemma. Lemma 2.4 (Main Reduction Lemma). Fix Z to be a finite set. For any simple combining operator ψ with base functions f,g : {0, 1}n → Z and any property P, we have (i) R(CPψ)≤2Q(P)·⌈log|Z|⌉, (ii) R0(CPψ)≤2Q1(P)·⌈log|Z|⌉, (iii) R→(CPψ ) ≤ Qna(P) · ⌈log |Z|⌉, and (iv) R→,0(CPψ ) ≤ Qna,1(P) · ⌈log |Z|⌉. Proof. We begin by proving (iii). Let A be a q-query non- adaptive tester for P. We create a one-way protocol P for CPψ in the following manner. Alice and Bob use public randomness to gener- ate queries x(1), . . . , x(q). Then, Alice computes f(x(1)), . . . , f(x(q)) and sends them to Bob in a single (q · ⌈log |Z|⌉)-bit message. For each i, Bob computes g(x(i)) and combines it with f(x(i)) to compute h(x(i)). Finally, Bob emulates A using the responses h(x(1)), . . . , h(x(q)) and outputs 1 if and only if A accepts h. If A has two-sided error, then by the correctness of A, P computes CPψ with probability at least 2/3. Hence, R→(CPψ) ≤ q · ⌈log |Z|⌉. In particular, if A is an optimal non-adaptive tester with two-sided error, then q = Qna(P), and part (iii) of the lemma is proved. If A has one-sided error, then whenever h ∈ P, the protocol P correctly outputs 1, and when h is ε-far from P, the proto- col correctly outputs 0 with probability at least 2/3. Therefore, Property Testing Lower Bounds 15 R→,0(CPψ ) ≤ q · ⌈log |Z|⌉. In particular, when A is an optimal non- adaptive tester with one-sided error, R→,0(CPψ) ≤ Q1(P)·⌈log|Z|⌉. Now, suppose A is a q-query adaptive tester for P. Again, Alice and Bob use public randomness to generate queries x(1),...,x(q). However, since A is adaptive, the distribution of the ith query x(i) depends on h(x(j)) for all j < i. Instead of generating all queries in advance, Alice and Bob generate queries one at a time. Each time a query x(i) is generated, Alice and Bob exchange f(x(i)) and g(x(i)). Since ψ is a simple combining operator, this is enough information for Alice and Bob to individually compute h(x(i)), which in turn gives them enough information to generate the next query with the appropriate distribution. When h(x(1)),...,h(x(q)) have all been computed, Bob outputs 1 if and only if A accepts h. This protocol costs 2q · ⌈log |Z|⌉ bits of communication, and if A is an optimal adaptive tester, then R(CPψ ) ≤ 2 Q(P) · ⌈log |Z|⌉. Similarly, if A is an optimal adaptive tester with one-sided error, then R0(CPψ ) ≤ 2 Q1(P) · ⌈log |Z|⌉. 􏰮 2.4. Communication Complexity Problems. We achieve all of our testing lower bounds via Lemma 2.4. To prove lower bounds for CPψ , we reduce from one of several standard communication complexity problems. However, we often require special flavors of these problems—either we need protocols with one-sided error, or we require the input to be restricted in some balanced way. We describe the variants that we will need for our reductions in this section. Let n ∈ N,t := t(n), and x,y ∈ {0,1}n. We use ◦ to denote string concatenation and 0k (1k) to denote the string of k consec- utive zeros (ones). Let x ⊕ y denote the bitwise exclusive-or of x and y. We use 1 − x to denote the bitwise complement of x. The Hamming weight of a string x, denoted |x|, is the number of i such that xi = 1. The Hamming distance between strings x and y, denoted ∆(x, y), is the number of coordinates i such that xi ̸= yi. Note that ∆(x, y) = |x ⊕ y|. We are interested in the following functions: 16 Blais, Brody & Matulef Set-Disjointness. Alice and Bob are given n-bit strings x and y respectively and wish to compute n disjn(x, y) := 􏰘 xi ∧ yi . i=1 Equivalently, Alice and Bob’s inputs can be viewed as sets A, B ⊆ [n]. In this case, disj(A, B) = 1 if and only if their sets intersect. When n is clear from context, we drop the subscript. A cele- brated result of Kalyanasundaram & Schnitger (1992), later simpli- fied by Razborov (1990) and Bar-Yossef et al. (2002), showed that R(disjn) = Ω(n), even under the promise that A and B intersect in at most one element. Theorem 2.5 (Kalyanasundaram & Schnitger 1992). R(disjn) = Ω(n). We use a balanced version of disjointness called k-bal-disj. In this version, Alice receives a set A ⊆ [n] of size |A| = ⌊k/2⌋ + 1, Bob receives a set B ⊆ [n] of size ⌈k/2⌉ + 1, and there is a promise that |A ∩ B| ≤ 1. Lemma 2.6. For all 0 ≤ k ≤ n − 2, we have R(k-bal-disj) = Ω (min{k, n − k}). Proof. If n−k = O(1), there is nothing to prove. Otherwise, let m := min{⌊k/2⌋ + 1, n − k − 2}. We reduce from disjm. Partition the elements of [n]\[m] into sets I := {m+1,...,m+1+⌊k/2⌋} and J := {m + 2 + ⌊k/2⌋,...,n}. Note that |I| = ⌊k/2⌋ + 1. Furthermore, we have |J| ≥ ⌈k/2⌉ + 1, since |J| = n−(m+2+⌊k/2⌋)+1 = n−1−m−⌊k/2⌋ = n−1−m+⌈k/2⌉−k = ⌈k/2⌉+1+n−2−m−k ≥ ⌈k/2⌉+1, Property Testing Lower Bounds 17 where the penultimate equality holds because k = ⌊k/2⌋ + ⌈k/2⌉, and the inequality comes from the fact that m ≤ n − k − 2. Let A′ and B′ be the sets received by Alice and Bob respectively as inputs to disjm. Alice pads her input with elements from I until she gets a set of size ⌊k/2⌋ + 1. Bob similarly pads his input with elements from J. Let a := ⌊k/2⌋+1−|A′| and b := ⌈k/2⌉+1−|B′|. Specifically, Alice sets A = A′ ∪ {m + 1,...,m + a} and Bob sets B = B′ ∪ {n, n − 1, . . . , n − b + 1}. Note that |A| = ⌊k/2⌋+1, |B| = ⌈k/2⌉+1, and A∩B = A′ ∩ B′. Therefore, a solution to k-bal-disj(A, B) gives a solution to disjm(A′, B′), hence R(k-bal-disj) ≥ R(disjm) = Ω(m) = Ω(min{k, n − k}) . 􏰮 Gap-Equality. Alice and Bob are given n-bit strings x and y respectively and wish to compute 1 ifx=y,  geqn,t(x,y) := 0 if ∆(x,y) = t, ∗ otherwise. We drop the subscripts when n is clear from context and t = n/8. When geq(x, y) = ∗, we allow the protocol to output 0 or 1. We are interested in Rz(geq); recall that Rz(geq) is the minimum communication cost of a protocol for geq that only makes mis- takes when geq(x,y) = z. The standard public-coin equality protocol gives R0(geq) = O(1). For protocols that only err when geq(x, y) = 1, the complexity is drastically different. Buhrman et al. (1998) proved an Ω(n) lower bound on the de- terministic communication complexity of geqn,n/2; their result ex- tends to other gap sizes and to randomized protocols with one-sided error. Lemma 2.7 (Buhrman et al. 1998). For all even t = Θ(n), we have R1(geqn,t) = Ω(n).4 4Curiously, the parity of t turns out to be necessary. Since ∆(x,y) = |x| + |y| − 2|x ∧ y|, Alice and Bob can deterministically distinguish x = y from 18 Blais, Brody & Matulef Gap-Hamming-Distance. Alice and Bob are given n-bit strings x and y respectively and wish to compute 1 if∆(x,y)≥n/2+t,  ghdn,t(x,y):= 0 if∆(x,y)≤n/2−t, ∗ otherwise. As in the definition of set-disjointness, it will occasionally be useful to view inputs to ghd as sets A,B ⊆ [n] and to express ghd in terms of the size of the symmetric difference |A∆B| rather than Hamming distance ∆(x, y). The standard gap size for ghd is t = √n. In this case, we drop the subscripts and use just ghd. A tight lower bound of R(ghd) = Ω(n) is known, due to Chakrabarti & Regev (2011). Theorem 2.8 (Chakrabarti & Regev 2011). R(ghd) = Ω(n). For larger gap sizes, a padding argument5 implicit in Brody et al. (2010), together with the aforementioned Ω(n) bound for ghd, shows that R(ghdn,t) = Ω((n/t)2) for all t = Ω(√n). When we require one-sided error, the situation changes. Lemma 2.9. For all z ∈ {0,1} and all constant 0 < δ < 1/2, Rz(ghdn,δn) = Ω(n). Proof. First, we prove a lower bound for R0(ghdn,δn). Let d be the least integer greater than or equal to δn, and let m := n/2 + d. We reduce from geqm,2d. Specifically, let P be a protocol for ghdn,δn that only makes errors when ghdn,δn(x,y) = 0. We use it to construct a protocol Q for geqm,2d that makes mistakes ∆(x,y) being odd with a single bit of communication—Alice sends Bob the parity of |x|, and Bob computes the parity of |x| + |y|. This does not affect our property testing lower bounds. √ 5This padding argument reduces ghdn,√n to ghdn,t for any n < t ≤ O(n). Choose nˆ such that n = (nˆ/t(nˆ))2, and let m := nˆ/n. Then, given inputs x,y ∈ {0,1}n, Alice and Bob can compute ghdn,√n by repeating each string m times. The resulting strings xˆ and yˆ have length nˆ, and ghdnˆ,t(nˆ)(xˆ,yˆ) = ghdn,√n(x, y), hence a protocol for the former can be used to solve the latter. It follows that R(ghdnˆ,t(nˆ)) ≥ R(ghdn,√n) = Ω(n) = Ω((nˆ/t(nˆ))2). Property Testing Lower Bounds 19 only when x = y. Given inputs x,y ∈ {0,1}m, Alice constructs xˆ := x ◦ 0n−m, and Bob builds yˆ := y ◦ 1n−m. Then, they run protocol P and output Q(x,y) := 1 − P(xˆ,yˆ). Note that if x = y, then ∆(xˆ, yˆ) = n/2 − d ≤ n/2 − δn, hence ghd(xˆ, yˆ) = 0. Simi- larly, when ∆(x,y) = 2d, we have ∆(xˆ,yˆ)=n/2+d≥n/2+δn, and ghd(xˆ, yˆ) = 1. In either case, the new protocol correctly out- puts geqm,2d(x,y) whenever P correctly computes ghdn,δn(xˆ,yˆ). Since P only makes mistakes when ghdn,δn(xˆ,yˆ) = 0, it follows that Q only makes mistakes when geqm,2d(x,y) = 1. Therefore, R0(ghdn,δn) ≥ R1(geqm,2d) = Ω(m) = Ω(n). Next, we prove a lower bound for R1(ghdn,δn). Observe that ghdn,δn(x, y) = 1−ghdn,δn(x, 1−y). Therefore, Alice and Bob can build a protocol for ghdn,δn which errs only when ghdn,δn(x, y) = 0 from one which errs only when ghdn,δn(x,y) = 1 by comput- ing ghdn,δn(x, 1 − y) and inverting the output. It follows that R1(ghdn,δn) ≥ R0(ghdn,δn) = Ω(n). In this way, we get a lower bound for Rz(ghdn,δn) by embedding an instance of geq into either side of the ghd problem. 􏰮 We also consider an extended version of ghd. In eghdn,k,t, Alice and Bob’s inputs x, y are n-bit strings, with the promise that |x| = |y| = k, and they wish to distinguish ∆(x,y) ≥ k + t from ∆(x, y) ≤ k − t. Lemma 2.10. For all k, t ≤ n/2, we have R(eghdn,k,t) = Ω(min{(k/t)2, k}) . In particular, we show that ghdn,t remains hard even under the promise that |x| = |y| = n/2. Proof. First, we prove the lemma for the case k = n/2 by reduction from ghdn/2,t/2. Let P be a protocol for eghdn,n/2,t. Given inputs xˆ, yˆ ∈ {0, 1}n/2 to ghdn/2,t/2, Alice and Bob create n-bit strings x, y by mapping each bit 0 → 01 and each bit 1 → 10.6 Then, they run protocol P on input (x,y) and output the result. 6Formally, Alice creates an n-bit string x by setting x2i−1 := xˆi and x2i := 1 − xˆi for all 1 ≤ i ≤ n/2. Similarly, Bob defines y by setting y2i−1 := yˆi and y2i :=1−yˆi. 20 Blais, Brody & Matulef Note that |x| = |y| = n/2. Furthermore, ∆(x, y) = 2∆(xˆ, yˆ). Therefore, if ∆(xˆ,yˆ) ≥ n/4+t/2 then ∆(x,y) ≥ n/2+t, and simi- larly if ∆(xˆ, yˆ) ≤ n/4−t/2 then ∆(x, y) ≤ n/2−t. It follows that a correct answer for eghdn.n/2,t gives a correct answer to ghdn/2,t/2, hence R(eghdn,n/2,t) ≥ R(ghdn/2,t/2) = Ω 􏰃min 􏰷(n/t)2, n􏰸􏰄 = Ω 􏰃min 􏰷(k/t)2, k􏰸􏰄 . Proving the general case follows from a simple padding argument. Specifically, we reduce eghd2k,k,t to eghdn,k,t. Given 2k-bit strings xˆ and yˆ, Alice and Bob construct n-bit strings x and y by setting x := xˆ◦0n−2k and y := yˆ◦0n−2k. It is easy to see that |x| = |y| = k and that ∆(x,y) = ∆(xˆ,yˆ). Therefore, an answer to eghdn,k,t(x, y) gives an answer to eghd2k,k,t(xˆ, yˆ), hence R(eghdn,k,t) ≥ R(eghd2k,k,t) = Ω 􏰃min{(k/t)2, k}􏰄 . 􏰮 3. Testing k-Linearity and Related Properties In this section we prove Theorem 1.1. Recall that a k-linear func- tion is a function of the form f(x) = 􏰂i∈S xi (mod 2) for some set S ⊆ [n] of size |S| = k. We use k-linear to denote the property that a function is k-linear. The definitions of the other properties in the statement of Theorem 1.1 are as follows: Definition 3.1 (Junta). The function f : {0, 1}n → {0, 1} is a k-junta if there is a set J ⊆ [n] of size |J| ≤ k such that for every x,y∈{0,1}n thatsatisfyxi =yi foreachi∈J,wehavef(x)= f(y). We use k-junta to denote the property that a function is a k-junta. Definition 3.2 (Low Fourier degree). For convenience when dis- cussing Fourier degree we will represent boolean functions using range {−1, 1} instead of {0, 1}. It is well known that every boolean function f : {0, 1}n → {−1, 1} has a unique representation of 􏰂 ˆ 􏰂 xi the form f(x) = S⊆[n] f(S)χS(x), where χS = (−1) i∈S and Property Testing Lower Bounds 21 f(S) ∈ R. The terms f(S) are the Fourier coefficients of f, and ˆˆ the Fourier degree of f is the maximum value of k ≥ 0 such that ˆ7 f(S)̸=0forsomesetSofsize|S|=k. Weusedegree-kto denote the property that the Fourier degree of a function is at most k. Definition 3.3 (Sparse polynomials). Every boolean function f : {0, 1}n → {0, 1} also has a unique representation as a polynomial over GF(2). We say that f is a k-sparse polynomial if its represen- tation over GF(2) has at most k terms. Let k-sparse denote the property that a function has a k-sparse GF(2) representation. The following facts about k-linear functions will be used in the proof of Theorem 1.1: Proposition 3.4. Fix n > 2 and 1 ≤ k ≤ n − 2. If f : {0, 1}n → {0, 1} is (k + 2)-linear, then f is
(i) 1 -far from k-linear functions, 2
(ii) 1-farfromk-juntas, 2
(iii) 1-far from functions of Fourier degree at most k, and 2
(iv) 1 -far from k-sparse polynomials. 20
Proof. We first prove part (iii). Parts (i) and (ii) will follow immediately from part (iii) and the observation that k-juntas and k-linear functions have Fourier degree at most k.
Let f be a (k + 2)-linear function over the variables of some set
T ⊆ [n] of size |T| = k+2, and let g be any function of Fourier
degree at most k. For convenience, we will represent f and g as
functions from {0, 1}n to {−1, 1}. Since f is a linear function over ˆˆ
the variables in T, we know that f(T) = 1, and f(S) = 0 for all S ̸= T. Moreover, since g has Fourier degree k and |T| > k, we know by definition that gˆ(T ) = 0. Thus by Parseval’s theorem
􏰕ˆ
f(S)gˆ(S) = 0 ,
S ⊆[n]
7For more details on the Fourier representation of boolean functions see, e.g., O’Donnell 2008; de Wolf 2008.
E[f(x)g(x)] = x

22 Blais, Brody & Matulef
which implies Prx[f(x) ̸= g(x)] = 1/2.
Finally, part (iv) is a special case of a more general theorem of
Diakonikolas et al. (2007, Thm. 36). For convenience, we provide a self-contained proof as Lemma A.1 in Appendix A. 􏰮
Theorem 1.1 (Restated). Fix 1 < k < n − 1. Then, at least Ω(min{k, n − k}) queries are required to test (i) k-linear functions, (ii) k-juntas, (iii) functions of Fourier degree at most k, and (iv) functions with k-sparse polynomial representation in GF(2). Proof. We prove the lower bound for k-linear functions by re- ducing from the k-bal-disj problem. Recall that Ck-linear is the communication game where the inputs are the functions f,g : {0,1}n → {0,1} and the players must test whether the function h = f ⊕ g is k-linear. Lemma 2.4 and 2.6 imply that 2 Q(k-linear) ≥ R(Ck-linear) ⊕ and To complete the proof, we show that R(Ck-linear) ≥ R(k-bal-disj) R(k-bal-disj) = Ω(min{k, n − k}) . ⊕ with a reduction from k-bal-disj to Ck-linear. ⊕ Let A,B ⊆ [n] be the two sets of size |A| = ⌊k⌋+1 and 2 |B| = ⌈k⌉+1 received by Alice and by Bob, respectively, as the in- 2 put to an instance of k-bal-disj. Alice and Bob can construct the functions ParityA, ParityB : {0, 1}n → {0, 1}. When |A ∩ B| = 1, the symmetric difference of the two sets has size |A△B| = |A| + |B|−2|A∩B| = k, and the function ParityA⊕ParityB = ParityA△B is k-linear. Conversely, when A and B are disjoint, the function ParityA ⊕ ParityB is a (k+2)-parity function and, by Proposi- tion 3.4, it is 1-far from k-linear functions. So Alice and Bob can 2 solve their instance of k-bal-disj with a communication protocol for Ck-linear. This implies that R(Ck-linear) ≥ R(k-bal-disj), as ⊕ ⊕⊕ we wanted to show. The same reduction from k-bal-disj yields lower bounds for testing the other properties. Let Ck-junta denote the communica- ⊕ tion game where Alice and Bob receive boolean functions f and g Property Testing Lower Bounds 23 as inputs and must decide if h = f ⊕ g is a k-junta. Similarly, de- fine Cdegree-k and Ck-sparse to be the corresponding communication ⊕⊕ games where Alice and Bob must decide if h has Fourier degree at most k or can be represented by a k-sparse GF(2) polynomial. In the k-bal-disj reduction above, Alice and Bob create a joint func- tion ParityA△B that is (k + 2)-linear if A and B are disjoint, and k-linear if A and B intersect. By Proposition 3.4, (k + 2)-linear functions are 1 -far from k-juntas, 1 -far from functions with Fourier 22 degree at most k, and 1 -far from k-sparse polynomials. Recalling 20 that a k-linear function is a k-junta, has Fourier degree at most k, and can be represented by a k-sparse GF(2) polynomial, it follows that R(k-bal-disj) ≤ min{R(Ck-junta),R(Cdegree-k),R(Ck-sparse)}. ⊕⊕⊕ Together with Lemma 2.4 and 2.6, we have 2 Q(k-junta) ≥ R(Ck-junta) = Ω(min{k, n − k}) , 2 Q(degree-k) ≥ R(Cdegree-k) = Ω(min{k, n − k}) , and ⊕ 2 Q(k-sparse) ≥ R(Ck-sparse) = Ω(min{k, n − k}) . 􏰮 ⊕ Testing linear functions with 0-1 coefficients. With Oded Goldreich’s kind permission, we present his proof of Theorem 1.3. Theorem 1.3 (Restated). Testing the class of linear functions from GF(3)n to GF(3) that have only 0-1 coefficients requires Θ(n) queries. Proof. The upper bound in the theorem follows from the query complexity of learning subclasses of linear functions over GF(3)n. See Goldreich (2010a) for the details. For the lower bound, we establish a reduction from the set- disjointness problem. Let C{0,1}-Lin be the communication game where Alice and Bob receive the functions f, g : GF(3)n → GF(3), respectively, and must determine if the function h = f + g (where the sum is taken pointwise over GF(3)) is linear and has only {0, 1}- coefficients. By Lemma 2.4 and Theorem 2.5, 4 Q({0,1}-Lin) ≥ R(C{0,1}-Lin) and R(disj) = Ω(n) . + ⊕ + 24 Blais, Brody & Matulef To complete the proof, it suffices to show that R(C{0,1}-Lin) ≥ R(disj). We do so with a reduction from set-disjointness to the {0,1}-Lin testing communication game. Let a, b ∈ {0, 1}n be the strings received by Alice and Bob as input to the set-disjointness problem. Alice and Bob build the functions f,g : GF(3)n → GF(3) defined by f(x) = 􏰂ni=1 aixi and g(x) = 􏰂ni=1 bixi, respectively. The combined function h = f +g is defined by h(x) = 􏰂ni=1(ai + bi)xi. This function is clearly linear. When a and b are disjoint, then every coefficient ai + bi of h takes value 0 or 1. Conversely, when a and b are not disjoint, there is an index i for which ai + bi = 2. Then for any linear function l with {0, 1}-valued coefficients, the function h − l is a non-zero linear function. The Schwartz-Zippel Lemma states that every non-zero linear function over GF(3)n takes the value 0 on at most 1 of the inputs from GF(3)n. Thus, h is 2-far from all the linear 33 functions with only {0, 1}-valued coefficients. Therefore, Alice and Bob can run a protocol for C{0,1}-Lin to solve their instance of set- disjointness and R(C{0,1}-Lin) ≥ R(disj). 􏰮 + Testing computability by width-4 OBDDs. Again with the kind permission of Oded Goldreich, we present his proof of Theo- rem 1.4. We again remark that this result was obtained indepen- dently by Brody et al. (2011), who gave a similar proof. The hard instances we use in this proof are those introduced by Goldreich (2010a). Let nˆ := ⌊n−1⌋. We develop base functions 4 from the following primitive functions. Consider the following four- bit functions φ0, φ1, φ2, φ3 : {0, 1}4 → {0, 1}: φ0(x1, x2, x3, x4) := 0 , φ1(x1,x2,x3,x4):=x1x3 , φ2(x1,x2,x3,x4):=x2x4 , φ3(x1, x2, x3, x4) := x1x3 ⊕ x2x4 . Given z = (z1,...,znˆ) ∈ {0,1,2,3}nˆ, define the function hz : {0, 1}n → {0, 1} by setting nˆ hz(x1,...,xn):=x1 ⊕􏰕φzj(x4j−2,x4j−1,x4j,x4j+1). + + j=1 Property Testing Lower Bounds 25 Lemma 3.5 (Goldreich 2010a, Thm. 4.2). Fix z ∈ {0, 1, 2, 3}nˆ . If every coordinate j ∈ [nˆ] of z satisfies zj ∈ {0, 1, 2} then hz can be computed by a width-4 OBDD. Otherwise, if there exists j ∈ [nˆ] such that zj = 3 then hz is 1 -far from all functions computable 16 by width-4 OBDDs. Goldreich (2010a) uses Lemma 3.5 to show that testing com- putability by width-4 OBDDs requires Ω(√n) queries. We combine this construction with our technique and get an Ω(n) lower bound. Theorem 1.4 (Restated). Testing the class of functions that are computable by width-4 OBDDs requires Θ(n) queries. Proof. We reduce from disjnˆ . Given sets A, B ⊆ [nˆ], Alice and Bobfirstbuildstringsa,b∈{0,1,2,3}nˆ bysetting 􏰀1 ifj∈A 􏰀2 ifj∈B aj := 0 otherwise and bj := 0 otherwise Alice and Bob then define functions f := ha and g := hb. We define the combining operator ψ(f,g) to return the function hz, where zj := aj + bj . By our choice of base functions, for every x ∈ {0, 1}n we have hz(x) = ha(x) ⊕ hb(x) ⊕ x1, so ψ is a simple combining operator. Also, our definitions of a, b, and z imply that zj = 3 iff j ∈ A ∩ B. By Lemma 3.5, this means that hz is computable by a width-4 OBDD when A and B is disjoint and hz is 1 -far 16 from all functions computable by width-4 OBDDs when A and B intersect. Thus, R(C4-OBDD) ≥ R(disjnˆ). This inequality, together ψ with Lemma 2.4, Theorem 2.5, and the fact that nˆ = Θ(n), yields 2 Q(4-OBDD) ≥ R(C4-OBDD) ≥ R(disj ) = Ω(n) . 􏰮 ψ nˆ 4. Testing Monotonicity and Submodularity Fix R ⊆ R. Recall that the function f : {0, 1}n → R is monotone if for any two inputs x,y ∈ {0,1}n that satisfy x1 ≤ y1,...,xn ≤ yn, we have f(x) ≤ f(y). In this section, we prove the following lower bound on the query complexity for testing monotonicity. 26 Blais, Brody & Matulef Theorem Theorem 1.6 (Restated). Testing f : {0, 1}n → R for monotonicity requires Ω(min{n, |R|2}) queries. Proof. We first consider the case where R = R. We prove the lower bound for testing monotonicity in this case with a reduction from set-disjointness. Let ψ be the combining operator that, given two functions f, g : {0, 1}n → {−1, 1}, returns the function h : {0,1}n → Z defined by h(x) := 2|x| + f(x) + g(x). Define Cmono be the communication game where Alice and Bob are given ψ two functions f, g : {0, 1}n → {−1, 1} and they must test whether h is monotone. By Lemma 2.4 and Theorem 2.5, 2 Q(mono) ≥ R(Cmono) and R(disj) = Ω(n) . ψ We complete the proof by showing that R(Cmono) ≥ R(disj). ψ Let A,B ⊆ [n] be the subsets received by Alice and Bob as the input to an instance of the set-disjointness problem. Alice and Bob build functions χA, χB : {0, 1}n → {−1, 1}, respectively, by setting χA(x) = (−1)􏰂i∈A xi and χB(x) = (−1)􏰂i∈B xi. Let h:=ψ(χA,χB). Notethath(x)=2|x|+χA(x)+χB(x).Weclaim that (a) when A and B are disjoint, h is monotone, and (b) when A and B are not disjoint, h is 1 -far from monotone. If this claim is 8 true, then we have completed our lower bound since it implies that Alice and Bob can run a protocol for Cmono to solve their instance of set-disjointness and, therefore, R(Cmono) ≥ R(disj). ψ We now prove Claim (a). Fix i ∈ [n]. For x ∈ {0,1}n, let x0, x1 ∈ {0, 1}n be the vectors obtained by fixing the ith coordinate of x to 0 and to 1, respectively. Note that for any set S ⊆ [n], χS(x1) = −χS(x0) if i ∈ S, and χS(x1) = χS(x0) otherwise. So when i ∈/ A and i ∈/ B, h(x1) − h(x0) = 2 |x1| − 2 |x0| = 2 > 0 , when i ∈ A and i ∈/ B,
h(x1) − h(x0) = 2 |x1| − 2 |x0| − 2 χA(x0) ≥ 0 , and when i ∈/ A and i ∈ B,
h(x1 ) − h(x0 ) = 2 |x1 | − 2 |x0 | − 2 χB (x0 ) ≥ 0 .
ψ

Property Testing Lower Bounds 27
Those three inequalities imply that when i ̸∈ A∩B, the function h is monotone on each edge (x0, x1) in the ith direction. As a result, when A and B are disjoint the function h is monotone.
Let us now prove Claim (b). Let A∩B ̸= ∅. When i ∈ A∩B, h(x1 ) − h(x0 ) = 2 |x1 | − 2 |x0 | − 2 χA (x0 ) − 2 χB (x0 ).
This implies that for each x that satisfy χA(x0) = χB(x0) = 1,
it holds that h(x1 ) < h(x0 ). Partition {0, 1}n into 2n−1 pairs that form the endpoints to all the edges in the ith direction. We claim that at least 1 of these pairs satisfy the condition χA(x0) = 4 χB(x0) = 1. To see this, note that when A = {i}, then χA(x0) = 1 with certainty. On the other hand, if j ∈ A for some j ̸= i, then take any x0, and let xˆ0 be x0 with the jth bit flipped. Then, χA(x0) = −χA(xˆ0). It follows that χA(x0) = 1 with prob- ability exactly 1. A similar argument holds independently for 2 χB(x0). Therefore, at least 1 of these pairs will satisfy the con- 4 dition χA(x0) = χB(x0) = 1, and for each of these pairs, either h(x0) or h(x1) must be modified to make h monotone. Therefore, when A and B intersect, we need to modify at least 2n/8 entries, just to correct the violated edges in the ith direction. It follows that h is at least 1 -far from monotone. 8 Suppose now that |R| ≥ 12√n + 5. Without loss of generality assume R⊇{n−6√n−2,...,n+6√n+2}.8 As before, we do a reduction from set-disjointness. Alice and Bob receive sets A, B ⊆ [n], respectively, and build the functions χA, χB : {0, 1}n → {−1, 1}. The modification to the construction is in the definition of the combining operator. We now define ψ(χA,χB) to return the 8This essentially boils down to a renaming of R. Formally, we prove a lower bound for testing fˆ : {0, 1}n → Rˆ for an arbitrary Rˆ with |Rˆ| ≥ 12√n + 5 by reducing from the problem of testing f : {0,1}n → R. Let φ : R → Rˆ be the bijection that maps the ith least element of R to the ith least element of Rˆ, ˆˆˆ and define f(x) := φ(f(x)). f is monotone if f is monotone, and f is ε-far from monotone if f is ε-far from monotone. Thus, testing fˆ for monotonicity has the same query complexity as testing f for monotonicity. 28 Blais, Brody & Matulef function h′ defined by h(x) if n − 3√n ≤ |x| ≤ n + 3√n , √2√2 h′(x):= n−6 n−2 if|x|n +3 n. 2
Note that when n −3√n ≤ |x| ≤ n +3√n, we have n−6√n−2 ≤ √22′
h(x) ≤ n + 6 n + 2. The definition of h takes strings with low Hamming weight and “rounds h(x) up” to n − 6√n − 2. In the same way, it takes strings of high Hamming weight and “rounds h(x) down” to n + 6√n + 2. We claim that this preserves mono- tonicity when h is monotone, while ensuring that when h is far from monotone, our new function h′ remains reasonably far from monotone.
Claim 4.1. If A and B are disjoint, then h′ is monotone. If A and B intersect, then h′ is 1 -far from monotone.
16
Proof. Suppose that A ∩ B = ∅. We proceed in a manner
similar to the general case. Fix any i ∈ [n], and for x ∈ {0, 1}n, let
x0 and x1 be the vectors obtained by setting the ith bit to 0 and 1
respectively. If A and B are disjoint, then h is monotone. When
|x0| and |x1| both lie in the range {n − 3√n,…, n + 3√n}, then 22
h′(x1) = h(x1) ≥ h(x0) = h′(x0), so monotonicity is preserved. If |x0| and |x1| are either both less than n −3√n or both greater than
n√2′
2 + 3 n, then monotonicity is trivially preserved, as h is constant
on each of these ranges. It remains to show that monotonicity is
preserved when |x0| and |x1| lie in different ranges. But h′(x) =
h(x) ∈ {n−6√n−2,…,n+6√n+2} for all x in this middle
range. Therefore, h′(x) ≤ h′(y) ≤ h′(z) for all x,y,z such that
|x|n +3√n. 222
If A ∩ B ̸= ∅ then h is 1/8-far from monotone. We claim that h′ is at most 1/16-far from h. To see this, let x ∈ {0,1}n be a uniform random string. By the Chernoff Bound,
Pr 􏰥􏰅􏰅|x| − n 􏰅􏰅 > 3√n􏰦 < 0.03 < 1/16 . 2 Furthermore, h′(x) = h(x) whenever n − 3√n ≤ |x| ≤ n + 3√n, hence Pr[h′(x) ̸= h(x)] ≤ 1/16. 22 Property Testing Lower Bounds 29 Suppose for the sake of contradiction that h′ is not 1 -far from 16 monotone, and let h ̃ denote the monotone function closest to h′. By the triangle inequality and the fact that h′ is at most 1/16-far from h, we have Pr[h(x) ̸= h ̃(x)] ≤ Pr[h(x) ̸= h′(x)] + Pr[h′(x) ̸= h ̃(x)] < 1/8 . This violates the fact that h is 1/8-far from monotone. 􏰮 It is possible to make the above argument much tighter, and get the corresponding linear query complexity lower bound for a smaller range of |R|, although |R| remains Ω(√n). We chose the above range size to maximize clarity. Finally, suppose that |R| = o(√n). We prove the lower bound for testing monotonicity in this case via a reduction from the |R| ≥ 12√n + 5 case. Let A be an optimal monotonicity testing algorithm for functions f : {0, 1}n → R, and let m be the greatest integer such that |R| ≥ 12√m + 5. We will use A to construct a monotonicity testing algorithm A′ for functions g : {0, 1}m → R. The construction of A′ depends on the following claim. Claim 4.2. Given a function g : {0, 1}m → R, there exists a function h : {0, 1}n → R with the following properties: (i) If g is monotone, then h is monotone. (ii) If g is ε-far from monotone, then h is ε-far from monotone. (iii) For all x ∈ {0, 1}n, the value of h(x) can be deter- mined with one query to g. Proof. We construct h : {0,1}n → R from g by padding. Specifically, define h(x, y) := g(x) for strings x ∈ {0, 1}m and y ∈ {0,1}n−m. This construction clearly satisfies the first and third conditions of the claim. For the second condition, we prove the contrapositive. Suppose that h is not ε-far from monotone. Let h ̃ be the monotone function closest to h; thus, Prx,y[h ̃(x,y) ̸= h(x, y)] < ε. By an averaging argument, there exists y ̃ ∈ {0, 1}n−m such that Prx[h ̃(x, y ̃) ̸= h(x, y ̃)] < ε. Define g ̃ : {0, 1}m → R 30 Blais, Brody & Matulef as g ̃(x) := h ̃(x,y ̃). Then g ̃ is monotone and Prx[g ̃(x) ̸= g(x)] = Prx[h ̃(x, y ̃) ̸= h(x, y ̃)] < ε, so g is not ε-far from monotone. 􏰮 The construction of A′ is simple. Given input g : {0,1}m → R, let h be the function guaranteed by Claim 4.2. A′ runs A on h and accepts if and only if A accepts h. By Claim 4.2, if g is monotone, then h is monotone, and if g is ε-far from monotone, then h is ε-far from monotone. The correctness of A′ then follows from the correctness of A. Furthermore, by Claim 4.2, A′ uses one query per query of A. However, testing g : {0,1}m → R for monotonicity requires Ω(m) queries, because g has range size |R| ≥ 12√m + 5. Since A is an optimal monotonicity tester and uses the same number of queries as A′, it follows that testing f : {0,1}n → R for monotonicity requires Ω(m) = Ω(|R|2) queries when |R| = o(√n). 􏰮 Testing submodularity. The function f : {0, 1}n → R is sub- modular if for every x, y ∈ {0, 1}n, f(x∨y)+f(x∧y) ≥ f(x)+f(y), where(x∨y)i =max{xi,yi}and(x∧y)i =min{xi,yi}. Testing submodularity was first studied by Parnas, Ron & Rubinfeld (2003) for functions over rectangles. Seshadhri & Vondra ́k (2011) initiated the study of submodularity testing for functions over the boolean hypercube and, in particular, showed that testing submodularity is at least as difficult as testing monotonicity. Specifically, they established the following result. Lemma 4.3 (Seshadhri & Vondra ́k 2011). Given the function f : {0, 1}n → R, there exists a function g : {0, 1}n+1 → R with the following properties: (i) If f is monotone, then g is submodular. (ii) If f is ε-far from monotone, then g is ε -far from submodular. 2 (iii) For each x ∈ {0, 1}n+1, the value of g(x) can be determined with 2 queries to f. Combining the lemma with the lower bound of Fischer et al. (2002) on testing monotonicity yields a lower bound of Ω(logn) Property Testing Lower Bounds 31 queries for testing submodularity non-adaptively. This implies a weak lower bound of Ω(log log n) queries for general (i.e., adaptive) submodularity testers. Combining Lemma 4.3 with Theorem 1.6 instead, we get a stronger lower bound. Corollary 1.7 (Restated). Testing f : {0, 1}n → R for submod- ularity requires Ω(n) queries. Proof. Consider the task of testing whether f : {0, 1}n−1 → R is monotone. Let g : {0,1}n → R be the corresponding function whose existence is guaranteed by Lemma 4.3. We can test whether f is monotone by simulating a submodularity tester T on g. If T makes q queries, the resulting monotonicity tester makes a total of 2q queries. By Theorem 1.6, all monotonicity testers must make at least Ω(n) queries, so our submodularity tester must also make q = Ω(n) queries. 􏰮 5. Testing Concise Representations We begin with formal definitions for decision trees and branching programs. Definition 5.1 (Decision tree). A decision tree is a directed bi- nary tree in which each internal node is labelled with some element from [n], the two edges going out of an internal node are labelled with 0 and 1, and each leaf node has a label from {0, 1}. The deci- sion tree D computes the function f : {0, 1}n → {0, 1} if for every x ∈ {0,1}n, the path defined in D by querying the value of xi at each internal node labelled with i and following the corresponding edge leads to a leaf node with value f(x). The size of a decision tree is the total number of leaves it contains. Definition 5.2 (Branching program). A branching program is a directed acyclic graph with two sink nodes labelled with 0 and 1, respectively, and where all other nodes have out-degree 2. Each non-sink node has a label from [n] and the two edges leaving a node are labelled with 0 and 1, respectively. The branching program P computes the function f : {0, 1}n → {0, 1} if each x ∈ {0, 1}n 32 Blais, Brody & Matulef defines a path in the branching program that leads to the sink labelled with f(x). The size of a branching program is the total number of nodes it contains. The proof of Theorem 1.8 relies on the following two simple lemmas. Lemma 5.3. Fixs≥1and0<α<1. Letf :{0,1}n →{0,1}be an s-linear function. Then f can be computed by a decision tree of size 2s and is 1−α -far from all functions that are computable by 2 decision trees of size at most α 2s. Proof. To construct a decision tree of size 2s that computes the function f : x 􏰶→ xi1 ⊕ ··· ⊕ xis, create a complete tree of depth s where each node at level j of the tree queries xij . This tree has 2s leaves and, by setting the value of each leaf appropriately, computes the function f exactly. Consider now a decision tree T of size at most α2s, and let g : {0,1}n → {0,1} be the function computed by this tree. We want to show that Pr[f(x) ̸= g(x)] ≥ 1−α when the probability is 2 over the uniform distribution of x ∈ {0,1}n. For each leaf l of T, let depth(l) denote the number of unique variables queried by the nodes in the path from the root of T to l and let Rl ⊆ {0,1}n represent the set of inputs x ∈ {0,1}n that define a path in T that reaches l. (Note that the sets Rl form a partition of {0, 1}n.) Define B := 􏰒l : depth(l) 0. The lower bound of Chockler & Gutfreund (2004) gives a lower bound of Ω(k/t) queries for this task. (See also (Diakonikolas et al. 2007, App. E).) This bound is not sufficiently strong to answer Fischer et al.’s question for any t = ω(1).

The following result shows that for any t ≤ O( k), the task
of distinguishing k-juntas from functions that are far from (k + t)- juntas requires (asymptotically) as many queries as the standard k-junta testing problem.
Theorem 1.9 (Restated). Fix k ≤ n and t > 0. Any algorithm 2
that accepts k-juntas and rejects functions far from (k + t − 2)- juntas with high probability must make Ω􏰃 min{(k )2, k}􏰄 queries.
t
Proof. We prove the theorem with a reduction from the ex-
tended Gap Hamming Distance problem. If t = Ω(k), there is
nothing to prove. Otherwise, suppose t = o(k), and let C(k,t)-junta ⊕
be the communication game where Alice and Bob receive the func- tions f, g : {0, 1}n → {0, 1} and must distinguish between the case where h = f ⊕g is a k-junta from the case where h is far from (k + t − 2)-juntas. By Lemma 2.4 and 2.10 and the fact that k + t/2 = Θ(k), we have
2 Q􏰃(k, t)-junta􏰄 ≥ R(C(k,t)-junta) ⊕
and
To complete the proof, we want to show that R(C(k,t)-junta) ≥
R(eghdn,2k+t, t ). 2
R(eghdn,k+ t , t ) = Ω􏰃 min{(k )2, k}􏰄 . 22t
Let x and y be the strings received by Alice and Bob, respec- tively, as input to an instance of the eghdn,k+ t , t problem. Alice
22
and Bob then construct the functions ParityA, ParityB : {0, 1}n → {0,1},whereA:={i:xi =1}andB:={i:yi =1}. Thefunc- tion h = ParityA ⊕ ParityB = ParityA△B is |A△B|-linear. When ∆(x, y) = |A△B| ≤ k + t/2 − t/2 = k, the function h is a k-parity

36 Blais, Brody & Matulef
function. Conversely, when ∆(x, y) ≥ k + t/2 + t/2 = k + t, by
Proposition 3.4(ii), the function h is 1 -far from all (k + t − 2)- 2
juntas. Therefore, Alice and Bob can run a protocol for the game C(k,t)-junta to solve their instance of the Extended Gap Hamming
Distance problem and, as we wanted to show, R(C(k,t)-junta) ≥ ⊕

R(eghdn,k+ t , t ). 22
􏰮
Remark 5.5. The conference version of this paper (Blais et al. 2011) used a different argument to prove Theorem 1.8 and 1.9. During the review process, a flaw was found in that argument. For a retraction of the earlier version of Theorem 1.8 and a discussion of the error, see (Blais et al. 2012).
6. Testers with One-Sided Error
Testing decision trees. We saw in the last section that Ω(log s) queries are required to test whether a function can be represented as a boolean decision tree with at most s nodes. For testers with one-sided error, we get an exponentially larger bound.
Theorem 1.11 (Restated). At least Ω(s) queries are required to test size-s decision trees with one-sided error.
Proof. We first consider the case where s = 2n−1 for some n ≥
5. We prove this case with a reduction from the gap-equality
problem on s-bit strings. Let Cs-DT be the communication game ⊕
where Alice and Bob receive the functions f, g : {0, 1}n → {0, 1} and they must test whether the function h = f ⊕ g is computable by a decision tree of size s. By Lemma 2.4 and 2.7,
2 Q1􏰃s-DT􏰄 ≥ R1(Cs-DT) and R1(geq s ) = Ω􏰃s􏰄 . ⊕ s,8
We complete the proof by showing that R1(Cs-DT) ≥ R1(geq s ). ⊕ s,8
Let a,b ∈ {0,1}s be received by Alice and Bob as input to
an instance of the gap-equality problem. They must determine
if a = b or whether ∆(a,b) = s. Alice and Bob can solve their 8
instance of the geq problem with the following protocol. Let the set of vectors x ∈ {0,1}n with even parity Parity(x) = x1 ⊕ ··· ⊕

and
􏰀bx 1
Property Testing Lower Bounds 37
xn = 0 define an indexing of the bits of a. (I.e., fix a bijection between those strings and [s].) Alice and Bob build the functions f,g : {0,1}n → {0,1} by setting
􏰀ax 0
a decision tree of size at most 152n; when it can, they answer
when Parity(x) = 0, when Parity(x) = 1,
f(x) =
g(x) =
Alice and Bob then test whether f ⊕ g can be represented with
when Parity(x) = 0, when Parity(x) = 1.
∆(a, b) = s . 8
16
Let us verify the correctness of this protocol. For any x ∈
{0, 1}n where Parity(x) = 0, we have that (f ⊕ g)(x) = ax ⊕ bx.
Furthermore, for each x where Parity(x) = 1, we get (f ⊕g)(x) = 1.
So when a = b, then f ⊕ g is the Parity function. By Lemma 5.3,
this function is 1 -far from every decision tree of size at most 152n. 32 16
When ∆(a, b) = s , consider the (complete) tree that computes f ⊕g 8
by querying xi in every node at level i. This tree has 2n leaves, but
for every input x where ax ̸= bx, we have that the corresponding
leaf has the same value as its sibling. So for each such input, we
can eliminate one leaf. Therefore, we can compute f ⊕ g with a
decision tree of size at most 2n − 2n−1/8 < 15 2n. 􏰮 16 Testing signed k-majorities. Our next bound is for testing whether a function f : {−1, 1}n → {−1, 1} is a signed k-majority (for convenience, in this section we will switch notation and rep- resent boolean values with ±1 notation). A signed majority is a majority function with some variables negated, i.e. it is a half- space of the form f(x) = sgn(w · x), where w ∈ {−1, 1}n. If w ∈ {−1,0,1}n and exactly k of the wi’s are non-zero, we say it is a signed k-majority. (A signed majority function is thus also a signed n-majority function.) Signed majorities were studied by Matulef et al. (2009), who referred to them as {−1,1}-weight halfspaces. In that work, they showed a non-adaptive lower bound of Ω(logn) queries to test 38 Blais, Brody & Matulef whether a function is a signed majority on all n variables. Blais & O’Donnell (2010) studied the related problem of testing whether a function is a (non-signed) majority on exactly k out of n variables. When k ≤ 3n, they showed a lower bound of Ω(k1/12) queries for 4 non-adaptive algorithms with two-sided error. We show that Ω(k/ log k) queries are required to test whether f is a signed k-majority with one-sided error. The argument of Blais & O’Donnell (2010) can be adapted to show a non-adaptive, two- sided lower bound of Ω(k1/12) queries for this problem as well. Our bound is incomparable; it is asymptotically stronger and applies to adaptive algorithms, but only ones with one-sided error. The proof of our result relies on the following lemma. Lemma6.1. Foreveryα>0,thereexistk0 ∈Nandε>0such that for every k ≥ k0 and k′ ≥ (1 + α)k, all signed k′-majorities are ε-far from signed k-majorities.
We defer the proof of Lemma 6.1 to Appendix B. We are now ready to complete the proof of Theorem 1.12.
Theorem 1.12 (Restated). Fix any constant γ ∈ (0, 1). For any k ≤ γn, testing signed k-majorities with one-sided error requires at least Ω(k/ log k) queries.
Proof. We again use a reduction from the gap-equality prob- lem. Let k′ = k . Let ψ be a combining operator that takes
1−γ
functions f,g : {−1,1}n → {−k′,−k′ + 1,…,k′} and returns the
function h defined by h(x) := sgn􏰃 f (x)+g(x) 􏰄. Define Ck-maj 2ψ
to be the communication game where Alice and Bob receive functions f,g and must test if h = ψ(f,g) is a signed k-majority function.
By Lemma 2.4 and 2.7,
log(2k′ + 1) · Q1􏰃k-maj􏰄 ≥ R1(Ck-maj)
and
We complete the proof by showing that R1(Ck-maj) ≥ R1(geq ′
R1(geqk′,γk′ ) = Ω􏰃k′􏰄 . ψ
The theorem then follows by noting that Ω( k′ log k′
k ,γk
) = Ω( k ). log k
′ ).
ψ

Property Testing Lower Bounds 39
Leta,b∈{0,1}k′ bereceivedbyAliceandBob,respectively,as
the input to an instance of the geqk′,γk′ problem. Alice and Bob
generate the functions f,g : {−1,1}n → {−k,−k + 1,…,k} by
setting f(x) := 􏰂k′ (−1)aixi and g(x) := 􏰂k′ (−1)bixi, respec- i=1 i=1
tively. Notethatf(x)+g(x)=􏰂k′ 􏰃(−1)ai +(−1)bi􏰄xi,sohcan i=1
now be written as h(x) = sgn(w · x), where 1 if ai = bi = 0 ,

wi = 0 if ai ̸= bi , −1 ifai=bi=1.
When a = b, h is a signed k′-majority function. Lemma 6.1, shows
that signed k′-majority functions are a constant distance from all
signed k-majority functions. When ∆(a, b) = γk′, then ai = bi for
exactly k indices i ∈ [k′] and h is a signed k-majority function.
So Alice and Bob can solve their instance of the gap-equality
problem with a protocol for the Ck-maj game and, as we wanted to ψ
show, R1(Ck-maj) ≥ R1(geq ′ ′ ). 􏰮 ψ k ,γk
Acknowledgements
We thank Oded Goldreich for communicating the proof of The- orem 1.3 and 1.4 with us and for giving us permission to include these proofs. We also thank Sourav Chakraborty, David Garc ́ıa So- riano, and Arie Matsliah for sharing an early manuscript version of (Chakraborty et al. 2011a) with us. In addition, E.B. wishes to thank Ryan O’Donnell for several helpful discussions during the course of this research.
We thank Amit Weinstein, Tom Gur, and the anonymous ref- erees for insightful feedback on an earlier draft of this article. We offer special thanks to the referee who pointed out an error in our earlier version of Theorem 1.8 and to the referee who communi- cated the simplified proof of Lemma 2.10.
A preliminary version of this article appeared in pages 210–220 of the proceedings of the 26th Annual IEEE Conference on Compu- tational Complexity, 2011. Much of this work was performed while the second author was a postdoctoral researcher at Tsinghua Uni- versity Institute for Interdisciplinary Information Sciences (IIIS).

40 Blais, Brody & Matulef
This work was supported in part by the National Basic Research Program of China Grant 2007CB807900, 2007CB807901, and the National Natural Science Foundation of China Grant 61033001, 61061130540, 61073174.
The second author also acknowledges support from the Danish National Research Foundation and The National Science Founda- tion of China (under the grant 61061130540) for the Sino-Danish Center for the Theory of Interactive Computation and from the CFEM research center (supported by the Danish Strategic Re- search Council) within which part of this work was performed.
References
Noga Alon & Eric Blais (2010). Testing Boolean Function Isomor- phism. In Proc. 14th International Workshop on Randomization and Approximation Techniques in Computer Science, 394–405.
Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar & D. Sivakumar (2002). An Information Statistics Approach to Data Stream and Com- munication Complexity. In Proc. 43rd Annual IEEE Symposium on Foundations of Computer Science, 209–218.
Tugkan Batu, Ronitt Rubinfeld & Patrick White (1999). Fast Approximate PCPs for Multidimensional Bin-Packing Problems. In Proc. 3rd International Workshop on Randomization and Approxima- tion Techniques in Computer Science, 245–256.
Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung, So- fya Raskhodnikova & David P. Woodruff (2009). Transitive- Closure Spanners of the Hypercube and the Hypergrid. Technical Re- port TR09-046, ECCC.
Eric Blais (2008). Improved Bounds for Testing Juntas. In Proc. 12th International Workshop on Randomization and Approximation Tech- niques in Computer Science, 317–330.
Eric Blais (2009). Testing Juntas Nearly Optimally. In Proc. 41st Annual ACM Symposium on the Theory of Computing, 151–158.
Eric Blais, Joshua Brody & Kevin Matulef (2011). Property Testing Lower Bounds via Communication Complexity. In Proc. 26th Annual IEEE Conference on Computational Complexity.

Property Testing Lower Bounds 41 Eric Blais, Joshua Brody & Kevin Matulef (2012). Erratum to
Property Testing Lower Bounds via Communication Complexity.
Eric Blais & Daniel Kane (2011). Testing Properties of Linear Functions. Manuscript.
Eric Blais & Ryan O’Donnell (2010). Lower Bounds for Test- ing Function Isomorphism. In Proc. 25th Annual IEEE Conference on Computational Complexity, 235–246.
Manuel Blum, Michael Luby & Ronitt Rubinfeld (1993). Self- testing/correcting with applications to numerical problems. J. Comput. Syst. Sci. 47, 549–595. Earlier version in STOC’90.
Jop Brie ̈t, Sourav Chakraborty, David Garc ́ıa Soriano & Arie Matsliah (2010). Monotonicity Testing and Shortest-Path Rout- ing on the Cube. In Proc. 14th International Workshop on Randomiza- tion and Approximation Techniques in Computer Science.
Joshua Brody, Amit Chakrabarti, Oded Regev, Thomas Vidick & Ronald de Wolf (2010). Better Gap-Hamming Lower Bounds via Better Round Elimination. In Proc. 14th International Workshop on Randomization and Approximation Techniques in Com- puter Science.
Joshua Brody, Kevin Matulef & Chenggang Wu (2011). Lower Bounds for Testing Computability by Small-Width Branching Pro- grams. In Proc. 8th Annual Theory and Applications of Models of Com- putation.
Harry Buhrman, Richard Cleve & Avi Wigderson (1998). Quan- tum vs. classical communication and computation. In Proc. 30th Annual ACM Symposium on the Theory of Computing, 63–68.
Amit Chakrabarti & Oded Regev (2011). An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance. In Proc. 43rd Annual ACM Symposium on the Theory of Computing.
Sourav Chakraborty, David Garc ́ıa Soriano & Arie Matsliah (2011a). Efficient Sample Extractors for Juntas with Applications. In Proc. 38th International Colloquium on Automata, Languages and Pro- gramming.

42 Blais, Brody & Matulef
Sourav Chakraborty, David Garc ́ıa Soriano & Arie Matsliah (2011b). Nearly tight bounds for testing function isomorphism. In Proc. 22nd Annual ACM-SIAM Symposium on Discrete Algorithms.
Hana Chockler & Dan Gutfreund (2004). A lower bound for testing juntas. Information Processing Letters 90(6), 301–305.
Ilias Diakonikolas, Homin Lee, Kevin Matulef, Krzysztof Onak, Ronitt Rubinfeld, Rocco Servedio & Andrew Wan (2007). Testing for Concise Representations. In Proc. 48th Annual IEEE Symposium on Foundations of Computer Science, 549–558.
Yevgeniy Dodis, Oded Goldreich, Eric Lehman, Sofya Raskhodnikova, Dana Ron & Alex Samorodnitsky (1999). Im- proved testing algorithms for monotonicity. In Proc. 3rd International Workshop on Randomization and Approximation Techniques in Com- puter Science, 97–108.
Funda Ergun, Sampath Kannan, Ravi Kumar, Ronitt Ruben- feld & Mahesh Viswanathan (2000). Spot-Checkers. J. Comput. Syst. Sci. 60, 717–751.
W. Feller (1968). An introduction to probability theory and its appli- cations, volume 2. John Wiley & Sons.
Eldar Fischer, Guy Kindler, Dana Ron, Shmuel Safra & Alex Samorodnitsky (2004). Testing juntas. J. Comput. Syst. Sci. 68, 753–787.
Eldar Fischer, Eric Lehman, Ilan Newman, Sofya Raskhod- nikova, Ronitt Rubinfeld & Alex Samorodnitsky (2002). Mono- tonicity testing over general poset domains. In Proc. 34th Annual ACM Symposium on the Theory of Computing, 474–483.
Oded Goldreich (2010a). On Testing Computability by Small Width OBDDs. In Proc. 14th International Workshop on Randomization and Approximation Techniques in Computer Science, 574–587.
Oded Goldreich (editor) (2010b). Property Testing: Current Re- search and Surveys, volume 6390 of LNCS. Springer.
Oded Goldreich, Shafi Goldwasser, Eric Lehman, Dana Ron & Alex Samorodnitsky (2000). Testing monotonicity. Combinator- ica 20(3), 301–337.

Property Testing Lower Bounds 43
Oded Goldreich, Shafi Goldwasser & Dana Ron (1998). Prop- erty Testing and its Connection to Learning and Approximation. J. ACM 45(4), 653–750.
Johan H ̊astad & Avi Wigderson (2007). The Randomized Commu- nication Complexity of Set Disjointness. Theory of Computing 211–219.
Piotr Indyk & David Woodruff (2003). Tight Lower Bounds for the Distinct Elements Problem. In Proc. 45th Annual IEEE Symposium on Foundations of Computer Science, 283–289.
Bala Kalyanasundaram & Georg Schnitger (1992). The Proba- bilistic Communication Complexity of Set Intersection. SIAM J. Disc. Math. 5(4), 547–557.
Eyal Kushilevitz & Noam Nisan (1997). Communication Complex- ity. Cambridge University Press, Cambridge.
Kevin Matulef, Ryan O’Donnell, Ronitt Rubinfeld & Rocco Servedio (2009). Testing {-1,1}-weight halfspaces. In Proc. 13th Inter- national Workshop on Randomization and Approximation Techniques in Computer Science.
Peter Bro Miltersen, Noam Nisan, Shmuel Safra & Avi Wigderson (1995). On data structures and asymmetric communica- tion complexity. In Proc. 27th Annual ACM Symposium on the Theory of Computing, 103–111.
Ryan O’Donnell (2008). Some topics in analysis of boolean function. In Proc. 40th Annual ACM Symposium on the Theory of Computing, 569–578.
Michal Parnas, Dana Ron & Ronitt Rubinfeld (2003). On Test- ing Convexity and Submodularity. SIAM J. Comput. 32(5), 1158–1184.
Michal Parnas, Dana Ron & Alex Samorodnitsky (2002). Test- ing basic boolean formulae. SIAM J. Disc. Math. 16(1), 20–46.
Alexander Razborov (1990). On the Distributional Complexity of Disjointness. In Proc. 17thInternational Colloquium on Automata, Lan- guages and Programming, 249–253.
Dana Ron (2008). Property Testing: A Learning Theory Perspective. Foundations and Trends in Machine Learning 1(3), 307–402.

44 Blais, Brody & Matulef
Dana Ron (2009). Algorithmic and Analysis Techniques in Property Testing. Foundations and Trends in Theoretical Computer Science 5(2), 73–205.
Dana Ron & Gilad Tsur (2009). Testing Computability by Width Two OBDDs. In Proc. 13th International Workshop on Randomization and Approximation Techniques in Computer Science, 686–699.
Dana Ron & Gilad Tsur (2011). On Approximating the Number of Relevant Variables in a Function. In Proc. 15thInternational Workshop on Randomization and Approximation Techniques in Computer Science, 676–687.
Ronitt Rubinfeld & Madhu Sudan (1996). Robust characteriza- tions of polynomials with applications to program testing. SIAM J. Comput. 25, 252–271.
C. Seshadhri & Jan Vondra ́k (2011). Is submodularity testable? In Proc. 2nd Innovations in Computer Science.
Ronald de Wolf (2008). A Brief Introduction to Fourier Analysis on the Boolean Cube. Theory of Computing, Graduate Surveys 1, 1–20.
A. Distance from sparse polynomials
In this section we provide a self-contained proof that (k + 2)-linear functions are far from k-sparse polynomials. This lemma is a spe- cial case of Diakonikolas et al. (2007, Theorem 36). By considering only a special case of the theorem of Diakonikolas et al. (2007), we obtain a slightly stronger bound on the distance of (k + 2)-linear functions to k-sparse polynomials but the proof itself is essentially identical. We include the proof here primarily for completeness.
Lemma A.1 (Diakonikolas et al. 2007). Every (k + 2)-linear func- tion is 1 -far from a k-sparse polynomial over GF(2).
20
Proof. Let f be a (k + 2)-linear function, and without loss of generality assume f is a linear function on the first k + 2 variables, i.e. f(x) = x1 ⊕ · · · ⊕ xk+2. Let g be a k-sparse polynomial, i.e. g=T1⊕···⊕Tk whereeachTi isamonomial.Wewanttoshow

Property Testing Lower Bounds 45
that f and g are far. We can assume without loss of generality that g does not contain any length-1 terms, since if it did we could just subtract those terms off of both f and g to create f′ and g′, which have the same distance from each other. We could then prove the theorem for f′,g′, and a smaller value of k.
Define the influence of a variable xi in f, denoted Infi(f), in the standard way- i.e. Infi (f ) = Prx [f (x) ̸= f (x⊕i )] where x⊕i denotes x with the ith bit flipped. Define the total influence of f to be 􏰂i Infi(f).
For any f and g, it is straightforward to show that if for some i the difference |Infi(f) − Infi(g)| is at least δ, then f and g must have distance at least δ/2. When f is the (k + 2)-linear function defined above, each variable x1 through xk+2 has influence 1. Thus, to complete the proof, we will show that in g one of these variables must have influence at most 0.9.
If the total influence of x1 through xk+2 in g is less than 0.9(k + 2), then we are done, since the pigeonhole principle implies the existence of a variable xi with influence at most 0.9. Thus, in what follows, we assume
k+2
(A.2) 􏰕 Infi(g) ≥ 0.9(k + 2) . i
We can bound the total influence of x1 through xk+2 in g as follows. First, we write g = g2 ⊕ g3 where g2 is the collection of terms in g that have length 2, and g3 is the collection of terms in g that have length at least 3. Now note:
◦ Each variable xi that appears in g2 has Infi(g2) = 1/2. The reason is because since every term of g2 has length 2, xi is influential exactly when the other variables it appears with have parity 1, which happens exactly half the time.
◦ For each term in g3, the total contribution of that term to
the influences of all the variables is at most 3/4. To see why,
suppose the term has length m, then on a random assignment
the probability that a variable is relevant to that term is 1 , 2m−1
so the total effect the term can have on all the influences is
atmostm· 1 . Ifm≥3,thisisatmost3/4. 2m−1

46 Blais, Brody & Matulef
Let R2 be the number of terms of g2, and R3 be the number of terms in g3. By hypothesis, R2 + R3 ≤ k. Since each term of g2 contributes at most 1 to the total influence of g, and each term of g3 contributes at most 3/4 to the total influence of g, we have that
k+2
(A.3) 􏰕 Infi(g) ≤ R2 + (3/4)R3 .
i
Combining equations (A.2) and (A.3) we get that R2 + (3/4)R3 ≥ (9/10)k. Using the fact that R2 + R3 ≤ k, this implies that R3 ≤ (4/10)k, in other words there cannot be too many terms of length 3 or more in g. Now we can bound the influence of variables x1 through xk+2 in g.
k+2 k+2
􏰕 Infi(g) ≤ 􏰕[Infi(g2) + Infi(g3)]
ii
k+2 n
≤ 􏰕 Infi(g2) + 􏰕 Infi(g3) ii
≤ 1(k + 2) + 3 · R3 24
≤1(k+2)+3· 4 ·k 2 410
< 0.9(k + 2) . By the pigeonhole principle, there must exist a variable xi with influence at most 0.9 in g. 􏰮 B. Distance between majority functions We complete the proof of Lemma 6.1 in this section. A key ingre- dient in this proof is the Berry-Esseen theorem, a version of the Central Limit Theorem with error bounds (see e.g. Feller 1968): Theorem B.1 (Berry-Esseen). Let l(x) = c1x1 + · · · + cnxn be a linear form over the random ±1 bits xi. Assume |ci| ≤ τ for all i and write σ = 􏰛􏰂 c2i . Write F for the c.d.f. of l(x)/σ; i.e., Property Testing Lower Bounds 47 F(t) = Pr[l(x)/σ ≤ t]. Then for all t ∈ R, |F(t) − Φ(t)| ≤ O(τ/σ) · 1 , 1+|t|3 where Φ denotes the c.d.f. of X, a standard Gaussian random vari- able. In particular, if A ⊆ R is any interval then |Pr[l(x)/σ ∈ A] − Pr[X ∈ A]| ≤ C1(τ/σ), where C1 is an absolute constant. Lemma 6.1 (Restated). For every α > 0, there exist k0 ∈ N and ε>0suchthatforeveryk≥k0 andk′ ≥(1+α)k,allsigned k′-majorities are ε-far from signed k-majorities.
Proof. Let f be a signed k-majority, and g be a signed k′-
majority. It is easy to see that f and g have minimum distance
when they have the same sign pattern on their common variables.
So without loss of generality, assume f (x) = sgn(x1 + · · · + xk ) and
g(x) = sgn(x1 + · · · + xk′ ) (in other words, f is a majority function
on the first k variables, and g is a majority function on the first k′
variables). To simplify, we will write S(x) = 􏰂ki=1 xi and T(x) =
􏰂k′ xi. Thus, f(x) = sgn(S(x)) and g(x) = sgn(S(x) + T(x)). i=k+1
For any positive real number t, we have
Pr[f(x) ̸= g(x)] ≥ Pr[S(x) ∈ [0,t) and T(x) < −t] xx = Pr[S(x) ∈ [0, t)] · Pr[T (x) < −t] , xx where the equality follows from the fact that S and T are functions on disjoint sets of variables. Note that S is a linear form on k variables, so we can use the √ Berry-Esseen theorem on S with σ = k to get √√ Pr[S(x) ∈ [0, t)] ≥ (Φ(t/ k) − Φ(0)) − C1/ k x √√ ≥(Φ(t/ k)−1/2)−C1/ k , (B.2) where C1 is the constant from the Berry-Esseen theorem. Similarly, T is a linear form on αk variables, so we can use the Berry-Esseen theorem on T with σ = √ √√ αk to get (B.3) Pr[T(x) < −t] ≥ Φ(−t/ αk) − C1/ αk . x 48 Blais, Brody & Matulef √ Setting t to be, say, k, and then choosing k large enough insures that the quantities in both ((B.2)) and ((B.3)) are positive, and bigger than a constant which only depends on α. 􏰮 Manuscript received June 7, 2011 Eric Blais Computer Science Department Carnegie Mellon University Pittsburgh, PA, USA 15213 eblais@cs.cmu.edu Kevin Matulef IIIS, Tsinghua University matulef@gmail.com Joshua Brody Computer Science Department Aarhus University Aarhus, Denmark joshua.e.brody@gmail.com