




A lecture note from Carnegie Mellon University (CMU) on the analysis of boolean functions, focusing on linearity and Fourier expansion. The lecture, given by Ryan O'Donnell, covers the definitions of linearity for boolean functions, approximate linearity, and testing linearity using the BLR test. It also introduces the Fourier expansion and the use of parity functions in analyzing boolean functions.
Analysis of Boolean Functions (CMU 18-859S, Spring 2007)
Jan. 18, 2005
Lecturer: Ryan O’Donnell Scribe: Ryan O’Donnell
What does it mean for a boolean function to be linear? For the question to make sense, we must
have a notion of adding two binary strings. So let's take $f : \{0,1\}^n \to \{0,1\}$, and treat $\{0,1\}$ as $\mathbb{F}_2$.
Now there are two well-known classical notions of being linear:
Definition 1.1
(1) $f$ is linear iff $f(x + y) = f(x) + f(y)$ for all $x, y \in \{0,1\}^n$.
(2) $f$ is linear iff there are some $a_1, \ldots, a_n \in \mathbb{F}_2$ such that $f(x_1, \ldots, x_n) = a_1 x_1 + \cdots + a_n x_n$
$\Leftrightarrow$ there is some $S \subseteq [n]$ such that $f(x) = \sum_{i \in S} x_i$.
(Sometimes in (2) one allows an additive constant; we won’t, calling such functions affine .)
Since these definitions sound equally good we may hope that they’re equivalent; happily, they
are. Now (2) ⇒ (1) is easy:
(2) ⇒ (1):
$$f(x + y) = \sum_{i \in S} (x + y)_i = \sum_{i \in S} x_i + \sum_{i \in S} y_i = f(x) + f(y).$$
But (1) ⇒ (2) is a bit more interesting. The easiest proof:
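(1) ⇒ (2): Define $\alpha_i = f(e_i)$, where $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ has its single 1 in the $i$th coordinate. Now repeated use of condition (1) implies $f(x^1 + x^2 + \cdots + x^n) = f(x^1) + \cdots + f(x^n)$, so indeed
$$f((x_1, \ldots, x_n)) = f\Big(\sum_i x_i e_i\Big) = \sum_i x_i f(e_i) = \sum_i \alpha_i x_i.$$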
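As a sanity check on the equivalence, here is a minimal brute-force sketch for $n = 3$. It enumerates every truth table and compares the two definitions directly. The names `satisfies_def1` and `satisfies_def2` are illustrative, not from the lecture; the second one uses the observation from the proof that the only possible coefficients are $a_i = f(e_i)$.

```python
from itertools import product

n = 3
points = list(product((0, 1), repeat=n))  # all of {0,1}^n

def add(x, y):
    """Coordinate-wise addition in F_2 (XOR)."""
    return tuple(a ^ b for a, b in zip(x, y))

def satisfies_def1(f):
    """Definition (1): f(x + y) = f(x) + f(y) for all x, y."""
    return all(f[add(x, y)] == (f[x] ^ f[y]) for x in points for y in points)

def satisfies_def2(f):
    """Definition (2): f(x) = sum_i a_i x_i, where necessarily a_i = f(e_i)."""
    units = [tuple(int(j == i) for j in range(n)) for i in range(n)]
    a = [f[e] for e in units]
    return all(f[x] == sum(ai * xi for ai, xi in zip(a, x)) % 2 for x in points)

# Enumerate all 2^(2^n) truth tables and filter by each definition.
tables = [dict(zip(points, values)) for values in product((0, 1), repeat=len(points))]
linear1 = [f for f in tables if satisfies_def1(f)]
linear2 = [f for f in tables if satisfies_def2(f)]
```

Both filters should select the same truth tables, one for each subset $S \subseteq [n]$.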
Nothing in this world is perfect, so let’s ask: What does it mean for f to be approximately linear?
Here are the natural first two ideas:
Definition 1.2
(1′) $f$ is approximately linear if $f(x + y) = f(x) + f(y)$ for most pairs $x, y \in \{0,1\}^n$.
(2′) $f$ is approximately linear if there is some $S \subseteq [n]$ such that $f(x) = \sum_{i \in S} x_i$ for most $x \in \{0,1\}^n$.
Are these two equivalent? It's easy to see that (2′) ⇒ (1′) still essentially holds: if $f$ has the right value at $x$, at $y$, and at $x + y$ (which happens for most pairs), the equation in the (2) ⇒ (1) proof holds up.
The reverse implication is not clear: Take any linear function and mess up its values on $e_1, \ldots, e_n$. Now $f(x + y) = f(x) + f(y)$ still holds whenever none of $x$, $y$, and $x + y$ is one of the $e_i$'s, which is true for almost all pairs. But now the equation in the (1) ⇒ (2) proof is going to be wrong for very many $x$'s. So this proof doesn't work. Still, our $f$ does satisfy (2′), so maybe a different proof will work.
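The corrupted-parity example can be checked numerically. The sketch below assumes $n = 8$ and a hypothetical choice of $S$ (the first three coordinates), and encodes inputs as $n$-bit integers so that addition in $\mathbb{F}_2^n$ is XOR. It measures three things: how often the pairwise condition holds, how close $f$ is to the original parity, and how well the parity rebuilt from the values $f(e_i)$ agrees with $f$.

```python
n = 8
N = 1 << n                 # number of inputs, encoded as n-bit integers
S_mask = 0b00000111        # hypothetical choice: S = the first three coordinates

def parity(mask, x):
    """Parity (0 or 1) of the bits of x selected by mask."""
    return bin(mask & x).count("1") % 2

unit = [1 << i for i in range(n)]   # e_1, ..., e_n as bitmasks

# f is the parity on S with its value flipped at every e_i.
f = [parity(S_mask, x) ^ int(x in unit) for x in range(N)]

# Fraction of pairs (x, y) with f(x + y) = f(x) + f(y).
pairs_ok = sum(f[x ^ y] == (f[x] ^ f[y]) for x in range(N) for y in range(N)) / N**2

# Fraction of inputs on which f agrees with the original parity.
close_to_parity = sum(f[x] == parity(S_mask, x) for x in range(N)) / N

# The (1) => (2) recipe: read off alpha_i = f(e_i) and rebuild a parity from them.
alpha_mask = sum(e for e in unit if f[e])
rebuilt_agreement = sum(f[x] == parity(alpha_mask, x) for x in range(N)) / N
```

With these parameters the pairwise condition holds for over 90% of pairs and $f$ agrees with the original parity on all but $n$ of the $2^n$ inputs, while the rebuilt parity agrees with $f$ on only about half of the inputs.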
We will investigate this shortly, but let's first decide on (2′) as our official definition:
Definition 1.3 $f, g : \{0,1\}^n \to \{0,1\}$ are $\epsilon$-close if they agree on at least a $(1 - \epsilon)$-fraction of the inputs $\{0,1\}^n$. Otherwise they are $\epsilon$-far.
Definition 1.4 $f$ is $\epsilon$-close to having property $P$ if there is some $g$ with property $P$ such that $f$ and $g$ are $\epsilon$-close.
A “property” here can really just be any collection of functions. For our current discussion, P is
the set of $2^n$ linear functions.
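For small $n$, the distance from $f$ to the property "linear" can be computed by brute force over all $2^n$ parities. The following is a minimal sketch; `distance_to_linear` is an illustrative name, and the approach is only feasible for small $n$ since it makes $2^n \cdot 2^n$ evaluations.

```python
from itertools import product

def parity(a, x):
    """The linear function sum_i a_i x_i over F_2."""
    return sum(ai * xi for ai, xi in zip(a, x)) % 2

def distance_to_linear(f, n):
    """Smallest fraction of inputs on which f disagrees with some linear function.

    f maps n-bit tuples to 0/1; f is eps-close to linear iff the result is <= eps.
    """
    points = list(product((0, 1), repeat=n))
    best = 1.0
    for a in points:  # a is the indicator vector of the set S
        disagreements = sum(f(x) != parity(a, x) for x in points)
        best = min(best, disagreements / len(points))
    return best
```

For example, the two-bit AND (in 0/1 notation) disagrees with the zero function on one input out of four, so its distance to linear is 0.25.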
Given that we've settled on definition (2′), why worry about definition (1′)? Imagine someone hands you some black-box software $f$ that is supposed to compute some linear function, and your job is to test it, i.e., try to identify bugs. You can't be sure $f$ is perfect unless you "query" its value $2^n$ times, but perhaps you can become convinced $f$ is $\epsilon$-close to being linear with many fewer queries.
If you knew which linear function $f$ was supposed to be close to, you could just check it on $O(1/\epsilon)$ many random values; if you found no mistakes, you'd be quite convinced $f$ was $\epsilon$-close to linear.
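A minimal sketch of this spot check, assuming the target set $S$ is known. If $f$ is $\epsilon$-far from the target parity, each random query exposes a mistake with probability more than $\epsilon$, so about $c/\epsilon$ queries all pass with probability less than $(1 - \epsilon)^{c/\epsilon} \le e^{-c}$. The function name `spot_check` and the constant `c = 3` are illustrative choices, not from the lecture.

```python
import random

def spot_check(f, S, n, eps, rng, c=3):
    """Compare f with the parity on S at about c/eps random inputs.

    Returns False as soon as a disagreement is found, True otherwise.
    S holds 0-based coordinate indices; f takes a list of n bits.
    """
    trials = int(c / eps) + 1
    for _ in range(trials):
        x = [rng.randrange(2) for _ in range(n)]
        if f(x) != sum(x[i] for i in S) % 2:
            return False
    return True
```

Passing the check does not prove that $f$ equals the target; it only makes it unlikely that $f$ is $\epsilon$-far from it.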
Do the same for all $2^n$ linear (Parity) functions, writing each of $\chi_\emptyset, \chi_{\{1\}}, \ldots, \chi_{[n]}$ as a vector of its values.
Notation: χS is Parity on the coordinates in set S; [n] = { 1 , 2 ,... , n}.
Now it's easy: the closest Parity to $f$ is the physically closest vector.
[Figure: $f$ drawn as a vector together with $\chi_{S_1}$, $\chi_{S_2}$, $\chi_{S_3}$; $f$ is closest to $\chi_{S_1}$.]
It's extra-convenient if we replace 0 and 1 with 1 and −1; then the dot product of two vectors measures their closeness (the bigger the dot product, the closer). This motivates the Great Notational Switch we'll use 99% of the time.
Great Notational Switch: 0/False → +1, 1/True → − 1.
We think of +1 and − 1 here as real numbers. In particular, we now have:
Addition (mod 2) → Multiplication (in R).
We now write:
A generic boolean function: $f : \{-1,1\}^n \to \{-1,1\}$.
The Parity on bits $S$ function, $\chi_S : \{-1,1\}^n \to \{-1,1\}$:
$$\chi_S(x) = \prod_{i \in S} x_i.$$
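To see that the switch really turns addition mod 2 into multiplication, the following sketch converts every bit string to $\pm 1$ notation and compares the two forms of Parity. The helper names `to_pm1` and `chi`, and the example values of `n` and `S` (0-based indices), are assumptions made for illustration.

```python
from itertools import product
from math import prod

def to_pm1(b):
    """Great Notational Switch: 0 -> +1, 1 -> -1."""
    return 1 - 2 * b

def chi(S, x):
    """Parity on the coordinates in S for x in {-1, 1}^n (0-based indices)."""
    return prod(x[i] for i in S)

n, S = 4, (0, 2, 3)  # hypothetical example values

# Sum mod 2 in 0/1 notation, then switched, versus the product in +-1 notation.
switch_ok = all(
    to_pm1(sum(bits[i] for i in S) % 2) == chi(S, [to_pm1(b) for b in bits])
    for bits in product((0, 1), repeat=n)
)
```

Note that the empty product is 1, so `chi((), x)` is the constant function $+1$, matching the Parity on the empty set.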
We now have:
Fact 2.1 The dot product of $f$ and $\chi_S$, as vectors in $\{-1,1\}^{2^n}$, equals
(# $x$'s such that $f(x) = \chi_S(x)$) − (# $x$'s such that $f(x) \neq \chi_S(x)$).
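Definition 2.2 For any $f, g : \{-1,1\}^n \to \mathbb{R}$, we write
$$\langle f, g \rangle = \frac{1}{2^n} (\text{dot product of } f \text{ and } g \text{ as vectors}) = \operatorname{avg}_{x \in \{-1,1\}^n} [f(x) g(x)] = \mathbf{E}_{x \in \{-1,1\}^n} [f(x) g(x)].$$
We also call this the correlation of $f$ and $g$ (see footnote 1).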
Fact 2.3 If $f$ and $g$ are boolean-valued, $f, g : \{-1,1\}^n \to \{-1,1\}$, then $\langle f, g \rangle \in [-1, 1]$. Further, $f$ and $g$ are $\epsilon$-close iff $\langle f, g \rangle \geq 1 - 2\epsilon$.
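The second claim follows in one line from the agreement-minus-disagreement count of Fact 2.1. A short worked derivation, writing $\delta$ (a symbol introduced here for convenience) for the fraction of inputs on which $f$ and $g$ disagree:

```latex
\begin{align*}
\langle f, g \rangle
  &= \mathbf{E}_x[f(x) g(x)]
   = \Pr_x[f(x) = g(x)] - \Pr_x[f(x) \neq g(x)] \\
  &= (1 - \delta) - \delta
   = 1 - 2\delta .
\end{align*}
% Hence  \delta \le \epsilon  \iff  \langle f, g \rangle \ge 1 - 2\epsilon .
```

The first equality on the second line uses only that $f(x) g(x)$ is $+1$ when the values agree and $-1$ when they differ.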
Now in our linearity testing problem, given $f : \{-1,1\}^n \to \{-1,1\}$ we are interested in the Parity function having maximum correlation with $f$. Let's give notation for these correlations:
Definition 2.4 For $S \subseteq [n]$, we write
$$\hat{f}(S) = \langle f, \chi_S \rangle.$$
Now with the switch to −1 and 1, something interesting happens with the $2^n$ Parity functions; they become orthogonal vectors:
Proposition 2.5 If $S \neq T$ then $\chi_S$ and $\chi_T$ are orthogonal; i.e., $\langle \chi_S, \chi_T \rangle = 0$.
Proof: Let $i \in S \triangle T$ (the symmetric difference of these sets); without loss of generality, say $i \in S \setminus T$. Pair up all $n$-bit strings as $(x, x^{(i)})$, where $x^{(i)}$ denotes $x$ with the $i$th bit flipped. Now the vectors $\chi_S$ and $\chi_T$ look like this on "coordinates" $x$ and $x^{(i)}$:
$$\begin{array}{rcc} & x & x^{(i)} \\ \chi_S = & [\; a & -a \;] \\ \chi_T = & [\; b & b \;] \end{array}$$
for some bits $a$ and $b$. In the inner product, these coordinates contribute $ab - ab = 0$. Since we can pair up all coordinates like this, the overall inner product is 0. $\Box$
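The proposition is easy to confirm numerically for small $n$. This sketch builds the full table of inner products between Parity functions for $n = 3$; the names `inner`, `chi`, and `gram` are illustrative, and subsets are represented as tuples of 0-based indices.

```python
from itertools import combinations, product
from math import prod

n = 3
cube = list(product((-1, 1), repeat=n))
subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]

def inner(f, g):
    """<f, g> = average of f(x) g(x) over x in {-1, 1}^n."""
    return sum(f(x) * g(x) for x in cube) / len(cube)

def chi(S):
    """The Parity function on the coordinates in S."""
    return lambda x: prod(x[i] for i in S)

# Table of all pairwise inner products of Parity functions.
gram = {(S, T): inner(chi(S), chi(T)) for S in subsets for T in subsets}
```

Every off-diagonal entry should be 0, and every diagonal entry should be 1 because $\chi_S(x)^2 = 1$ for all $x$.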
Footnote 1: This doesn't agree with the technical definition of correlation in probability, but never mind.
Here are some example functions and their Fourier transforms. In the Fourier expansions, we will write $\prod_{i \in S} x_i$ in place of $\chi_S$.
$f$                              Fourier transform
$f(x) = 1$                       $1$
$f(x) = x_i$                     $x_i$
$\mathrm{AND}(x_1, x_2)$         $\frac{1}{2} + \frac{1}{2} x_1 + \frac{1}{2} x_2 - \frac{1}{2} x_1 x_2$
$\mathrm{MAJ}(x_1, x_2, x_3)$    $\frac{1}{2} x_1 + \frac{1}{2} x_2 + \frac{1}{2} x_3 - \frac{1}{2} x_1 x_2 x_3$
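$f$ (a function of three bits) with Fourier coefficients:
$\hat{f}(\emptyset) = -\frac{1}{4}$, $\hat{f}(\{1\}) = +\frac{3}{4}$, $\hat{f}(\{2\}) = -\frac{1}{4}$, $\hat{f}(\{3\}) = +\frac{1}{4}$,
$\hat{f}(\{1,2\}) = -\frac{1}{4}$, $\hat{f}(\{1,3\}) = +\frac{1}{4}$, $\hat{f}(\{2,3\}) = +\frac{1}{4}$, $\hat{f}(\{1,2,3\}) = +\frac{1}{4}$
$$f(x) = -\frac{1}{4} + \frac{3}{4} x_1 - \frac{1}{4} x_2 + \frac{1}{4} x_3 - \frac{1}{4} x_1 x_2 + \frac{1}{4} x_1 x_3 + \frac{1}{4} x_2 x_3 + \frac{1}{4} x_1 x_2 x_3$$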
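The AND and MAJ rows can be reproduced directly from Definition 2.4 by averaging $f(x) \chi_S(x)$ over the cube. The sketch below is a brute-force computation, not a fast transform; `fourier_coefficients` is an illustrative name, subsets are tuples of 1-based indices to match the notes, and the Python definitions of `AND` and `MAJ` assume the convention that −1 encodes True.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} where f_hat(S) = E_x[f(x) * chi_S(x)].

    S is a tuple of 1-based coordinate indices; values are exact fractions.
    """
    cube = list(product((-1, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(1, n + 1), k):
            total = sum(f(x) * prod(x[i - 1] for i in S) for x in cube)
            coeffs[S] = Fraction(total, len(cube))
    return coeffs

def AND(x):
    # -1 encodes True, so AND is -1 exactly when both inputs are -1.
    return -1 if x == (-1, -1) else 1

def MAJ(x):
    # Majority vote of three +-1 values (the sum is odd, so never zero).
    return 1 if sum(x) > 0 else -1

and_hat = fourier_coefficients(AND, 2)
maj_hat = fourier_coefficients(MAJ, 3)
```

Reading off `and_hat` and `maj_hat` gives the coefficients in the table, with the coefficients on the two-element sets and the empty set vanishing for MAJ.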
We will now prove one of the most important, basic facts about Fourier transforms:
Theorem 2.10 ("Plancherel's Theorem") Let $f, g : \{-1,1\}^n \to \mathbb{R}$. Then
$$\langle f, g \rangle = \mathbf{E}_{x \in \{-1,1\}^n} [f(x) g(x)] = \sum_{S \subseteq [n]} \hat{f}(S) \hat{g}(S).$$
This just says that when you express two vectors in an orthonormal basis, their inner product is
equal to the sum of the products of the coefficients. Proof:
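$$\langle f, g \rangle = \Big\langle \sum_{S \subseteq [n]} \hat{f}(S) \chi_S, \; \sum_{T \subseteq [n]} \hat{g}(T) \chi_T \Big\rangle$$
$$= \sum_S \sum_T \hat{f}(S) \hat{g}(T) \langle \chi_S, \chi_T \rangle \quad \text{(by linearity of inner product)}$$
$$= \sum_S \hat{f}(S) \hat{g}(S) \quad \text{(by orthonormality of the } \chi\text{'s).}$$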
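The identity can be checked exactly on small examples. The sketch below picks two integer-valued functions on $\{-1,1\}^3$ at random (a fixed seed and integer values are assumptions made so the arithmetic stays exact) and compares both sides using rational arithmetic; `hat` is an illustrative helper name.

```python
import random
from fractions import Fraction
from itertools import combinations, product
from math import prod

n = 3
cube = list(product((-1, 1), repeat=n))
subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]

def hat(f):
    """Fourier coefficients {S: <f, chi_S>} of f, given as a dict from points to values."""
    return {S: Fraction(sum(f[x] * prod(x[i] for i in S) for x in cube), len(cube))
            for S in subsets}

rng = random.Random(0)  # fixed seed so the example is reproducible
f = {x: rng.randint(-5, 5) for x in cube}
g = {x: rng.randint(-5, 5) for x in cube}

f_hat, g_hat = hat(f), hat(g)
lhs = Fraction(sum(f[x] * g[x] for x in cube), len(cube))   # E[f(x) g(x)]
rhs = sum(f_hat[S] * g_hat[S] for S in subsets)             # sum_S f_hat(S) g_hat(S)
```

Taking `g = f` in the same computation gives the special case where the average of $f(x)^2$ equals the sum of squared coefficients.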
Corollary 2.11 ("Parseval's Theorem") Let $f : \{-1,1\}^n \to \mathbb{R}$. Then
$$\langle f, f \rangle = \mathbf{E}_{x \in \{-1,1\}^n} [f(x)^2] = \sum_{S \subseteq [n]} \hat{f}(S)^2.$$
This just says that the squared length of a vector, when expressed in an orthonormal basis, equals
the sum of the squares of the coefficients. In other words, it’s the Pythagorean Theorem.
One very important special case:
Corollary 2.12 If $f : \{-1,1\}^n \to \{-1,1\}$ is a boolean-valued function,
$$\sum_{S \subseteq [n]} \hat{f}(S)^2 = 1.$$
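The reason is that $f(x)^2 = 1$ at every point when $f$ takes values in $\{-1, 1\}$. As a worked check, the second display applies this to the eight coefficients listed earlier for the three-bit example $f$ (seven of magnitude $\frac{1}{4}$ and one of magnitude $\frac{3}{4}$):

```latex
\begin{align*}
\sum_{S \subseteq [n]} \hat{f}(S)^2
  &= \mathbf{E}_{x}[f(x)^2] = \mathbf{E}_{x}[1] = 1 , \\
7 \cdot \left(\tfrac{1}{4}\right)^2 + \left(\tfrac{3}{4}\right)^2
  &= \tfrac{7}{16} + \tfrac{9}{16} = 1 .
\end{align*}
```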