Analysis of Boolean Functions (CMU 18-859S, Spring 2007)
Lecture 2: Linearity and the Fourier Expansion
Jan. 18, 2005
Lecturer: Ryan O’Donnell Scribe: Ryan O’Donnell
1 Linearity

What does it mean for a boolean function to be linear? For the question to make sense, we must have a notion of adding two binary strings. So let's take $f : \{0,1\}^n \to \{0,1\}$, and treat $\{0,1\}$ as $\mathbb{F}_2$.

Now there are two well-known classical notions of being linear:

Definition 1.1
(1) $f$ is linear iff $f(x+y) = f(x) + f(y)$ for all $x, y \in \{0,1\}^n$.
(2) $f$ is linear iff there are some $a_1, \dots, a_n \in \mathbb{F}_2$ such that $f(x_1, \dots, x_n) = a_1 x_1 + \cdots + a_n x_n$; equivalently, there is some $S \subseteq [n]$ such that $f(x) = \sum_{i \in S} x_i$.

(Sometimes in (2) one allows an additive constant; we won't, calling such functions affine.)

Since these definitions sound equally good we may hope that they're equivalent; happily, they are. Now (2) $\Rightarrow$ (1) is easy:

(2) $\Rightarrow$ (1): $f(x+y) = \sum_{i \in S} (x+y)_i = \sum_{i \in S} x_i + \sum_{i \in S} y_i = f(x) + f(y)$.

But (1) $\Rightarrow$ (2) is a bit more interesting. The easiest proof:

(1) $\Rightarrow$ (2): Define $\alpha_i = f(e_i) = f(0, \dots, 0, 1, 0, \dots, 0)$, where $e_i$ has its single 1 in coordinate $i$. Now repeated use of condition (1) implies $f(x^1 + x^2 + \cdots + x^n) = f(x^1) + \cdots + f(x^n)$ for strings $x^1, \dots, x^n$, so indeed $f((x_1, \dots, x_n)) = f(\sum x_i e_i) = \sum x_i f(e_i) = \sum \alpha_i x_i$.
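The equivalence of the two definitions is easy to spot-check mechanically for a specific parity. A minimal sketch in Python (the choices $n = 4$, $S = \{0, 2\}$, and 0-indexed coordinates are ours for illustration, not from the notes):

```python
from itertools import product

n = 4
S = {0, 2}  # an arbitrary subset of coordinates (0-indexed here)

def f(x):
    # Definition (2): f(x) = sum of x_i over i in S, in F_2
    return sum(x[i] for i in S) % 2

# Definition (1): f(x + y) = f(x) + f(y) over F_2, for every pair x, y
for x in product((0, 1), repeat=n):
    for y in product((0, 1), repeat=n):
        xy = tuple((a + b) % 2 for a, b in zip(x, y))  # coordinatewise addition mod 2
        assert f(xy) == (f(x) + f(y)) % 2
print("Definition (2) implies Definition (1): checked for all pairs")
```

Running the brute-force loop over all $2^n \cdot 2^n$ pairs confirms the (2) $\Rightarrow$ (1) direction for this parity.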


1.1 Approximate Linearity

Nothing in this world is perfect, so let's ask: What does it mean for $f$ to be approximately linear? Here are the natural first two ideas:

Definition 1.2
(1′) $f$ is approximately linear if $f(x+y) = f(x) + f(y)$ for most pairs $x, y \in \{0,1\}^n$.
(2′) $f$ is approximately linear if there is some $S \subseteq [n]$ such that $f(x) = \sum_{i \in S} x_i$ for most $x \in \{0,1\}^n$.

Are these two equivalent? It's easy to see that (2′) $\Rightarrow$ (1′) still essentially holds: If $f$ has the right value for both $x$ and $y$ (which happens for most pairs), the equation in the (2) $\Rightarrow$ (1) proof holds up.

The reverse implication is not clear: Take any linear function and mess up its values on $e_1, \dots, e_n$. Now $f(x+y) = f(x) + f(y)$ still holds whenever $x$ and $y$ are not $e_i$'s, which is true for almost all pairs. But now the equation in the (1) $\Rightarrow$ (2) proof is going to be wrong for very many $x$'s. So this proof doesn't work — but actually our $f$ does satisfy (2′), so maybe a different proof will work.

We will investigate this shortly, but let's first decide on (2′) as our official definition:

Definition 1.3 $f, g : \{0,1\}^n \to \{0,1\}$ are $\epsilon$-close if they agree on a $(1 - \epsilon)$-fraction of the inputs $\{0,1\}^n$. Otherwise they are $\epsilon$-far.

Definition 1.4 $f$ is $\epsilon$-close to having property $P$ if there is some $g$ with property $P$ such that $f$ and $g$ are $\epsilon$-close.

A "property" here can really just be any collection of functions. For our current discussion, $P$ is the set of $2^n$ linear functions.

1.2 Testing Linearity

Given that we've settled on definition (2′), why worry about definition (1′)? Imagine someone hands you some black-box software $f$ that is supposed to compute some linear function, and your job is to test it — i.e., try to identify bugs. You can't be sure $f$ is perfect unless you "query" its value $2^n$ times, but perhaps you can become convinced $f$ is $\epsilon$-close to being linear with many fewer queries.

If you knew which linear function $f$ was supposed to be close to, you could just check it on $O(1/\epsilon)$ many random values — if you found no mistakes, you'd be quite convinced $f$ was $\epsilon$-close to linear.
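That spot-check is easy to sketch when the target Parity is known. A minimal Python illustration — the black box `buggy_f`, the set $S$, and the choice of $\epsilon$ are all made up here; only the "query $O(1/\epsilon)$ random points" strategy comes from the notes:

```python
import random
from itertools import product

n = 10
S = {1, 3, 4}

def target(x):
    # The linear function f is supposed to compute
    return sum(x[i] for i in S) % 2

def buggy_f(x):
    # Hypothetical black box: agrees with target except on one input
    if x == (1,) * n:
        return 1 - target(x)
    return target(x)

eps = 0.01
queries = int(3 / eps)  # O(1/eps) random queries
mistakes = sum(
    buggy_f(x) != target(x)
    for x in (tuple(random.randrange(2) for _ in range(n)) for _ in range(queries))
)
# Finding no mistakes is strong evidence that buggy_f is eps-close to target;
# here the single bad input is hit only with probability ~ queries / 2^n.
print("mistakes found:", mistakes)
```

Of course, the interesting case — tested next — is when you do not know which linear function $f$ is supposed to be.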

Do the same for all $2^n$ linear (Parity) functions, each viewed as the vector of its values:

$\chi_\emptyset = [\,\cdots\,],\ \chi_{\{1\}} = [\,\cdots\,],\ \dots,\ \chi_{[n]} = [\,\cdots\,]$

Notation: $\chi_S$ is Parity on the coordinates in set $S$; $[n] = \{1, 2, \dots, n\}$.

Now it's easy to see that the closest Parity to $f$ is the physically closest vector.

[Figure: the vector $f$ alongside the Parity vectors $\chi_{S_1}, \chi_{S_2}, \chi_{S_3}$; here $f$ is closest to $\chi_{S_1}$.]

It's extra-convenient if we replace 0 and 1 with 1 and $-1$; then the dot product of two vectors measures their closeness (the bigger the dot product, the closer). This motivates the Great Notational Switch we'll use 99% of the time.

Great Notational Switch: 0/False $\to$ $+1$, 1/True $\to$ $-1$.

We think of $+1$ and $-1$ here as real numbers. In particular, we now have:

Addition (mod 2) $\to$ Multiplication (in $\mathbb{R}$).

We now write:

A generic boolean function: $f : \{-1,1\}^n \to \{-1,1\}$.

The Parity on bits $S$ function, $\chi_S : \{-1,1\}^n \to \{-1,1\}$:

$$\chi_S(x) = \prod_{i \in S} x_i.$$

We now have:

Fact 2.1 The dot product of $f$ and $\chi_S$, as vectors in $\{-1,1\}^{2^n}$, equals

$$(\#\,x\text{'s such that } f(x) = \chi_S(x)) - (\#\,x\text{'s such that } f(x) \neq \chi_S(x)).$$

Definition 2.2 For any $f, g : \{-1,1\}^n \to \mathbb{R}$, we write

$$\langle f, g \rangle = \frac{1}{2^n} \cdot (\text{dot product of } f \text{ and } g \text{ as vectors}) = \operatorname*{avg}_{x \in \{-1,1\}^n} [f(x)g(x)] = \operatorname*{E}_{x \in \{-1,1\}^n}[f(x)g(x)].$$

We also call this the correlation of $f$ and $g$.¹
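Definition 2.2 is a small computation over all of $\{-1,1\}^n$. A minimal Python sketch (the functions $f$ and $g$ below are our own toy example: $g$ is a Parity deliberately broken on one of the eight inputs, so the disagreement fraction is $1/8$):

```python
from itertools import product

def inner(f, g, n):
    # <f, g> = E[f(x) g(x)] over x in {-1, 1}^n
    pts = list(product((-1, 1), repeat=n))
    return sum(f(x) * g(x) for x in pts) / len(pts)

n = 3
f = lambda x: x[0] * x[1]                       # Parity on the first two coordinates
g = lambda x: -1 if x == (1, 1, 1) else f(x)    # disagrees with f on 1 of 8 inputs

print(inner(f, f, n))  # 1.0: any +/-1-valued function has correlation 1 with itself
print(inner(f, g, n))  # 0.75 = 1 - 2 * (1/8), matching Fact 2.3 below
```

The second printed value illustrates the closeness/correlation translation stated next in the notes.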

Fact 2.3 If $f$ and $g$ are boolean-valued, $f, g : \{-1,1\}^n \to \{-1,1\}$, then $\langle f, g \rangle \in [-1, 1]$. Further, $f$ and $g$ are $\epsilon$-close iff $\langle f, g \rangle \geq 1 - 2\epsilon$.

Now in our linearity testing problem, given $f : \{-1,1\}^n \to \{-1,1\}$ we are interested in the Parity function having maximum correlation with $f$. Let's give notation for these correlations:

Definition 2.4 For $S \subseteq [n]$, we write

$$\hat{f}(S) = \langle f, \chi_S \rangle.$$
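Definition 2.4 gives a direct brute-force recipe for Fourier coefficients. A minimal sketch (we use 0-indexed coordinate sets where the notes write $x_1, \dots, x_n$, and the majority function as our example; its coefficients match the table in Section 2.1):

```python
from itertools import product
from math import prod

def fourier_coeff(f, S, n):
    # f_hat(S) = <f, chi_S> = E[f(x) * prod_{i in S} x_i] over x in {-1, 1}^n
    pts = list(product((-1, 1), repeat=n))
    return sum(f(x) * prod(x[i] for i in S) for x in pts) / len(pts)

# Majority of three +/-1 bits: the sign of their sum
maj = lambda x: 1 if sum(x) > 0 else -1

print(fourier_coeff(maj, {0}, 3))        # 0.5
print(fourier_coeff(maj, {0, 1}, 3))     # 0.0
print(fourier_coeff(maj, {0, 1, 2}, 3))  # -0.5
```

This is exponential in $n$, of course; it is only meant to make the definition concrete.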

Now with the switch to $-1$ and $1$, something interesting happens with the $2^n$ Parity functions; they become orthogonal vectors:

Proposition 2.5 If $S \neq T$ then $\chi_S$ and $\chi_T$ are orthogonal; i.e., $\langle \chi_S, \chi_T \rangle = 0$.

Proof: Let $i \in S \triangle T$ (the symmetric difference of these sets); without loss of generality, say $i \in S \setminus T$. Pair up all $n$-bit strings: $(x, x^{(i)})$, where $x^{(i)}$ denotes $x$ with the $i$th bit flipped. Now the vectors $\chi_S$ and $\chi_T$ look like this on "coordinates" $x$ and $x^{(i)}$:

$$\chi_S = [\ \cdots\ a\ \ {-a}\ \cdots\ ], \qquad \chi_T = [\ \cdots\ b\ \ b\ \cdots\ ]$$

(the displayed entries sitting at positions $x$ and $x^{(i)}$) for some bits $a$ and $b$. In the inner product, these coordinates contribute $ab - ab = 0$. Since we can pair up all coordinates like this, the overall inner product is 0. $\Box$

¹ This doesn't agree with the technical definition of correlation in probability, but never mind.
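Proposition 2.5 (together with $\langle \chi_S, \chi_S \rangle = 1$, since each entry of $\chi_S$ is $\pm 1$) says the Parities form an orthonormal set. For small $n$ this can be verified exhaustively; a brute-force sketch, not from the notes:

```python
from itertools import combinations, product
from math import prod

n = 3
subsets = [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]
pts = list(product((-1, 1), repeat=n))

def chi(S, x):
    # Parity on S as a +/-1 product (empty product is 1, i.e. chi of the empty set)
    return prod(x[i] for i in S)

# Check orthonormality: <chi_S, chi_T> is 1 when S = T and 0 otherwise
for S in subsets:
    for T in subsets:
        ip = sum(chi(S, x) * chi(T, x) for x in pts) / len(pts)
        assert ip == (1 if S == T else 0)
print("orthonormality verified for n =", n)
```

Since there are $2^n$ Parities in a $2^n$-dimensional space, orthonormality means they form an orthonormal basis — the fact the Plancherel/Parseval proofs below rely on.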

2.1 Examples

Here are some example functions and their Fourier transforms. In the Fourier expansions, we will write $\prod_{i \in S} x_i$ in place of $\chi_S$.

  $f$                        Fourier transform
  $f(x) = 1$                 $1$
  $f(x) = x_i$               $x_i$
  $\mathrm{AND}(x_1, x_2)$         $\frac{1}{2} + \frac{1}{2}x_1 + \frac{1}{2}x_2 - \frac{1}{2}x_1x_2$
  $\mathrm{MAJ}(x_1, x_2, x_3)$    $\frac{1}{2}x_1 + \frac{1}{2}x_2 + \frac{1}{2}x_3 - \frac{1}{2}x_1x_2x_3$

And one more function $f : \{-1,1\}^3 \to \{-1,1\}$, given by its Fourier coefficients:

$\hat{f}(\emptyset) = -\frac{1}{4}$, $\hat{f}(\{1\}) = +\frac{3}{4}$, $\hat{f}(\{2\}) = -\frac{1}{4}$, $\hat{f}(\{3\}) = +\frac{1}{4}$,
$\hat{f}(\{1,2\}) = -\frac{1}{4}$, $\hat{f}(\{1,3\}) = +\frac{1}{4}$, $\hat{f}(\{2,3\}) = +\frac{1}{4}$, $\hat{f}(\{1,2,3\}) = +\frac{1}{4}$;

$$f(x) = -\tfrac{1}{4} + \tfrac{3}{4}x_1 - \tfrac{1}{4}x_2 + \tfrac{1}{4}x_3 - \tfrac{1}{4}x_1x_2 + \tfrac{1}{4}x_1x_3 + \tfrac{1}{4}x_2x_3 + \tfrac{1}{4}x_1x_2x_3.$$
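Each row of the table can be verified by brute force. A minimal Python check of the AND row (recall that under the Great Notational Switch, True is $-1$; the lambda encoding is ours):

```python
from itertools import product

# AND under the switch: output is True (-1) iff both inputs are True (-1)
AND = lambda x: -1 if x == (-1, -1) else 1

# The table's expansion: 1/2 + (1/2)x1 + (1/2)x2 - (1/2)x1x2
for x in product((-1, 1), repeat=2):
    expansion = 0.5 + 0.5 * x[0] + 0.5 * x[1] - 0.5 * x[0] * x[1]
    assert expansion == AND(x)
print("AND expansion from the table checks out on all four inputs")
```

The same loop, with the appropriate expansion, confirms the MAJ row and the eight coefficients of the last example.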

2.2 Parseval, Plancherel

We will now prove one of the most important, basic facts about Fourier transforms:

Theorem 2.10 ("Plancherel's Theorem") Let $f, g : \{-1,1\}^n \to \mathbb{R}$. Then

$$\langle f, g \rangle = \operatorname*{E}_{x \in \{-1,1\}^n}[f(x)g(x)] = \sum_{S \subseteq [n]} \hat{f}(S)\hat{g}(S).$$

This just says that when you express two vectors in an orthonormal basis, their inner product is equal to the sum of the products of the coefficients. Proof:

$$\langle f, g \rangle = \Big\langle \sum_{S \subseteq [n]} \hat{f}(S)\chi_S,\ \sum_{T \subseteq [n]} \hat{g}(T)\chi_T \Big\rangle = \sum_{S} \sum_{T} \hat{f}(S)\hat{g}(T)\langle \chi_S, \chi_T \rangle \quad \text{(by linearity of inner product)}$$
$$= \sum_{S} \hat{f}(S)\hat{g}(S) \quad \text{(by orthonormality of } \chi\text{'s).} \qquad \Box$$

Corollary 2.11 ("Parseval's Theorem") Let $f : \{-1,1\}^n \to \mathbb{R}$. Then

$$\langle f, f \rangle = \operatorname*{E}_{x \in \{-1,1\}^n}[f(x)^2] = \sum_{S \subseteq [n]} \hat{f}(S)^2.$$

This just says that the squared length of a vector, when expressed in an orthonormal basis, equals the sum of the squares of the coefficients. In other words, it's the Pythagorean Theorem.

One very important special case:

Corollary 2.12 If $f : \{-1,1\}^n \to \{-1,1\}$ is a boolean-valued function,

$$\sum_{S \subseteq [n]} \hat{f}(S)^2 = 1.$$
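Corollary 2.12 is easy to confirm numerically for any small boolean-valued function: compute every $\hat{f}(S)$ by brute force and sum the squares. A sketch using the majority function from the examples (any $\pm 1$-valued $f$ would do):

```python
from itertools import combinations, product
from math import prod

n = 3
pts = list(product((-1, 1), repeat=n))
subsets = [set(c) for r in range(n + 1) for c in combinations(range(n), r)]

def fourier_coeff(f, S):
    # f_hat(S) = E[f(x) * prod_{i in S} x_i]
    return sum(f(x) * prod(x[i] for i in S) for x in pts) / len(pts)

# Corollary 2.12 for MAJ: the squared coefficients must sum to exactly 1
maj = lambda x: 1 if sum(x) > 0 else -1
total = sum(fourier_coeff(maj, S) ** 2 for S in subsets)
print(total)  # 1.0
```

Here the four nonzero coefficients are each $\pm\frac{1}{2}$, so the sum is $4 \cdot \frac{1}{4} = 1$, as the corollary demands.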