The joy of data
Plus, bonus feature: fun with differentiation
Reading: DH&S Ch. 9.6.0-9.6.
Administrivia
- Homework 1 due date moved to this Thurs (Feb 2)
- If you didn’t know this, subscribe to ml-class mail list
HW1 FAQ
- Q1: Waaaaah! I don’t know where to start!
- A1: Be at peace, grasshopper. Enlightenment is not a path to a door, but a road leading forever toward the horizon.
- A1′: I suggest the following problem order: 8, 1, 11, 5, 6
HW1 FAQ
- Q2: For problem 11, what should I turn in for the answer?
- A2: Basically, this calls for 3 things:
- An algorithm (pseudocode) demonstrating how to learn a cost-sensitive tree
- csTree=buildCostSensDT(X,y,Lambda)
- An algorithm (pseudocode) for classifying a point given such a tree
- label=costDTClassify(x)
- A description of why these are the right algorithms
HW1 FAQ
- Q4: What’s with that whole concavity thing?
- A4: It all boils down to this picture:
[Figure: entropy of the two-class distribution [p(x), 1 − p(x)] plotted against p(x) ∈ [0, 1]; the impurities at a parent P and at children Pa, Pb illustrate the impurity drop Δi(N) sitting below the concave entropy curve.]
- Now you just have to prove that algebraically, for the general c > 2 case...
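Before grinding through the algebra, it can help to see the concavity claim hold numerically for the two-class case. A minimal NumPy sketch (an illustration only, not the proof the problem asks for):

```python
import numpy as np

def entropy(p):
    """Entropy of the two-class distribution [p, 1 - p], in bits."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0) at the endpoints
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# Concavity: entropy of the average >= average of the entropies,
# for every pair of probabilities pa, pb on a grid.
for pa in np.linspace(0.01, 0.99, 25):
    for pb in np.linspace(0.01, 0.99, 25):
        assert entropy((pa + pb) / 2) >= (entropy(pa) + entropy(pb)) / 2 - 1e-12

# The maximum is at p = 0.5, where the entropy is exactly 1 bit.
print(entropy(0.5))  # → 1.0
```

This is exactly the picture above: any chord between two points on the entropy curve lies below the curve, which is why a split's weighted child impurity never exceeds the parent's.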
HW1 FAQ
- Q6: How do I show what the maximum entropy of something is? Or that it’s concave, for that matter?
- A6: That brings us to...
5 minutes of math
- Finding the maximum of a function (of 1 variable):
- Old rule from calculus: $f(x)$ has a maximum at $x$ where
$$\frac{d}{dx} f(x) = 0 \quad \text{and} \quad \frac{d^2}{dx^2} f(x) < 0$$
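As a concrete instance (my own toy function, not one from the slides), take f(x) = −x² + 4x: solving f′(x) = −2x + 4 = 0 gives x = 2, and f″(x) = −2 < 0 confirms a maximum. A finite-difference sanity check:

```python
def f(x):
    return -x**2 + 4 * x

h = 1e-5
x0 = 2.0  # candidate from solving f'(x) = -2x + 4 = 0

# Central-difference estimates of the first and second derivatives at x0.
fprime = (f(x0 + h) - f(x0 - h)) / (2 * h)
fsecond = (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / h**2

assert abs(fprime) < 1e-6   # stationary point: f'(x0) ≈ 0
assert fsecond < 0          # f''(x0) < 0  =>  x0 is a maximum
```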
5 minutes of math
- Rule is: for a scalar function of a vector argument
- First derivative w.r.t. $\mathbf{x}$ is the vector of first partials. For example, for
$$f(\mathbf{x}) = -2x_1^2 + 2x_1 x_2 - 2x_2^2 + 2x_2 x_3 - 2x_3^2$$
$$\frac{\partial}{\partial \mathbf{x}} f(\mathbf{x}) = \begin{bmatrix} -4x_1 + 2x_2 \\ 2x_1 - 4x_2 + 2x_3 \\ 2x_2 - 4x_3 \end{bmatrix}$$
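The analytic gradient above can be sanity-checked against central differences (a NumPy sketch; the random test point is my choice):

```python
import numpy as np

def f(x):
    x1, x2, x3 = x
    return -2*x1**2 + 2*x1*x2 - 2*x2**2 + 2*x2*x3 - 2*x3**2

def grad(x):
    """Analytic gradient: the vector of first partials of f."""
    x1, x2, x3 = x
    return np.array([-4*x1 + 2*x2,
                      2*x1 - 4*x2 + 2*x3,
                      2*x2 - 4*x3])

# Compare against central differences at a random point.
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
h = 1e-6
numeric = np.array([(f(x + h*e) - f(x - h*e)) / (2*h) for e in np.eye(3)])
assert np.allclose(numeric, grad(x), atol=1e-4)
```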
5 minutes of math
- Second derivative:
- Matrix of all possible second partial combinations:
$$\frac{\partial^2}{\partial \mathbf{x}^2} f(\mathbf{x}) =
\begin{bmatrix}
\frac{\partial^2}{\partial x_1^2} f(\mathbf{x}) & \frac{\partial^2}{\partial x_1 \partial x_2} f(\mathbf{x}) & \cdots & \frac{\partial^2}{\partial x_1 \partial x_d} f(\mathbf{x}) \\
\frac{\partial^2}{\partial x_2 \partial x_1} f(\mathbf{x}) & \frac{\partial^2}{\partial x_2^2} f(\mathbf{x}) & & \vdots \\
\vdots & & \ddots & \\
\frac{\partial^2}{\partial x_d \partial x_1} f(\mathbf{x}) & \frac{\partial^2}{\partial x_d \partial x_2} f(\mathbf{x}) & \cdots & \frac{\partial^2}{\partial x_d^2} f(\mathbf{x})
\end{bmatrix}$$
5 minutes of math
- Equivalent of the second derivative test:
- Hessian matrix is negative definite
- I.e., all eigenvalues of H are negative
- Use this to show
- An extremum is a maximum
- System is concave
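For the example f(x) from the gradient slide, the Hessian is a constant matrix, so the eigenvalue test is easy to run. A sketch with the Hessian entered by hand (NumPy assumed):

```python
import numpy as np

# Hessian of f(x) = -2*x1^2 + 2*x1*x2 - 2*x2^2 + 2*x2*x3 - 2*x3^2:
# each entry is the corresponding second partial, constant here.
H = np.array([[-4.,  2.,  0.],
              [ 2., -4.,  2.],
              [ 0.,  2., -4.]])

eigvals = np.linalg.eigvalsh(H)  # eigvalsh: eigenvalues of a symmetric matrix

# All eigenvalues negative => H is negative definite => f is concave,
# so its stationary point is a maximum.
assert np.all(eigvals < 0)
```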
Exercise
- Given the function:
$$f(\mathbf{x}) = 2x_1^2 + 4x_1 x_2 + x_2^2$$
- Find the extremum
- Show that the extremum is really a minimum
Separation of train & test
- Fundamental principle (1st amendment of ML):
- Don’t evaluate the accuracy (performance) of your classifier (learning system) on the same data used to train it!
Holdout data
- Usual to “hold out” a separate set of data for testing; not used to train classifier
- A.k.a., test set, holdout set, evaluation set, etc.
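A minimal sketch of holding out data with NumPy (the toy data, the 80/20 split fraction, and the variable names are my choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 5))    # toy feature matrix
y = rng.integers(0, 2, size=100)     # toy binary labels

# Shuffle indices, then hold out the last 20% as the test set.
idx = rng.permutation(len(X))
n_train = int(0.8 * len(X))
train_idx, test_idx = idx[:n_train], idx[n_train:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# The classifier is fit on (X_train, y_train) only;
# accuracy is then reported on (X_test, y_test).
assert len(X_train) == 80 and len(X_test) == 20
assert set(train_idx).isdisjoint(test_idx)  # no train/test leakage
```

Shuffling before splitting matters: if the data file is ordered by class or by time, a naive head/tail split gives a test set that does not look like the training set.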