Super VIP Cheatsheet: Machine Learning

CS 229 – Machine Learning https://stanford.edu/~shervine

Afshine Amidi and Shervine Amidi

September 15, 2018

Contents

1 Supervised Learning

1.1 Introduction to Supervised Learning

1.2 Notations and general concepts

1.3 Linear models

1.3.1 Linear regression

1.3.2 Classification and logistic regression

1.3.3 Generalized Linear Models

1.4 Support Vector Machines

1.5 Generative Learning

1.5.1 Gaussian Discriminant Analysis

1.5.2 Naive Bayes

1.6 Tree-based and ensemble methods

1.7 Other non-parametric approaches

1.8 Learning Theory

2 Unsupervised Learning

2.1 Introduction to Unsupervised Learning

2.2 Clustering

2.2.1 Expectation-Maximization

2.2.2 k-means clustering

2.2.3 Hierarchical clustering

2.2.4 Clustering assessment metrics

2.3 Dimension reduction

2.3.1 Principal component analysis

2.3.2 Independent component analysis

3 Deep Learning

3.1 Neural Networks

3.2 Convolutional Neural Networks

3.3 Recurrent Neural Networks

3.4 Reinforcement Learning and Control

4 Machine Learning Tips and Tricks

4.1 Metrics

4.1.1 Classification

4.1.2 Regression

4.2 Model selection

4.3 Diagnostics

5 Refreshers

5.1 Probabilities and Statistics

5.1.1 Introduction to Probability and Combinatorics

5.1.2 Conditional Probability

5.1.3 Random Variables

5.1.4 Jointly Distributed Random Variables

5.1.5 Parameter estimation

5.2 Linear Algebra and Calculus

5.2.1 General notations

5.2.2 Matrix operations

5.2.3 Matrix properties

5.2.4 Matrix calculus

Stanford University, Fall 2018

1 Supervised Learning

1.1 Introduction to Supervised Learning

1.2 Notations and general concepts

1.3 Linear models

1.3.1 Linear regression

1.5 Generative Learning

1.5.1 Gaussian Discriminant Analysis

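Estimation – Maximizing the likelihood gives the following estimates of $\phi$, $\mu_j$ ($j = 0, 1$) and $\Sigma$:

$$\hat{\phi} = \frac{1}{m}\sum_{i=1}^{m} 1_{\{y^{(i)} = 1\}}$$

$$\hat{\mu}_j = \frac{\sum_{i=1}^{m} 1_{\{y^{(i)} = j\}}\, x^{(i)}}{\sum_{i=1}^{m} 1_{\{y^{(i)} = j\}}} \quad (j = 0, 1)$$

$$\hat{\Sigma} = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)} - \mu_{y^{(i)}}\right)\left(x^{(i)} - \mu_{y^{(i)}}\right)^T$$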

1.5.2 Naive Bayes

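Assumption – The Naive Bayes model supposes that the features of each data point are all independent given the class:

$$P(x \mid y) = P(x_1, x_2, \ldots \mid y) = \prod_{i=1}^{n} P(x_i \mid y)$$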

1.6 Tree-based and ensemble methods

1.7 Other non-parametric approaches

1.8 Learning Theory

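Hoeffding inequality – Let $Z_1, \ldots, Z_m$ be $m$ i.i.d. variables drawn from a Bernoulli distribution of parameter $\phi$. Let $\hat{\phi}$ be their sample mean and $\gamma > 0$ fixed. We have:

$$P(|\phi - \hat{\phi}| > \gamma) \leqslant 2\exp(-2\gamma^2 m)$$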
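As a minimal sketch of how the bound above (the Hoeffding inequality for Bernoulli variables) can be used, the snippet below evaluates 2 exp(−2γ²m) and inverts it to find the smallest sample size m for which the bound falls below a target δ. The function names `hoeffding_bound` and `samples_needed` and the symbol δ are illustrative assumptions, not part of the cheatsheet.

```python
import math


def hoeffding_bound(gamma, m):
    """Upper bound on P(|phi - phi_hat| > gamma) for m i.i.d. Bernoulli samples."""
    return 2.0 * math.exp(-2.0 * gamma ** 2 * m)


def samples_needed(gamma, delta):
    """Smallest integer m such that 2 * exp(-2 * gamma^2 * m) <= delta.

    Obtained by solving the bound for m: m >= log(2 / delta) / (2 * gamma^2).
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * gamma ** 2))


if __name__ == "__main__":
    gamma, delta = 0.1, 0.05  # hypothetical tolerance and failure probability
    m = samples_needed(gamma, delta)
    print(m, hoeffding_bound(gamma, m))
```

The bound holds for any value of the parameter φ, so it can be loose in practice; it only gives a sufficient sample size.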

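Training error – For a given classifier $h$, we define the training error $\hat{\epsilon}(h)$, also known as the empirical risk or empirical error, as follows:

$$\hat{\epsilon}(h) = \frac{1}{m}\sum_{i=1}^{m} 1_{\{h(x^{(i)}) \neq y^{(i)}\}}$$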
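The training error defined above is the fraction of the m training examples that the classifier h labels incorrectly. A minimal sketch follows, assuming the classifier is any Python callable; the name `training_error` and the threshold classifier in the usage example are hypothetical.

```python
def training_error(h, xs, ys):
    """Fraction of training examples (x, y) for which h(x) != y."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    mistakes = sum(1 for x, y in zip(xs, ys) if h(x) != y)
    return mistakes / len(xs)


if __name__ == "__main__":
    # Hypothetical 1-D threshold classifier and toy data.
    def h(x):
        return int(x > 0.5)

    xs = [0.1, 0.4, 0.6, 0.9]
    ys = [0, 1, 1, 1]
    print(training_error(h, xs, ys))
```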

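Upper bound theorem – Let H be a finite hypothesis class with $|H| = k$, and let $\delta$ and the sample size $m$ be fixed. Then, with probability at least $1 - \delta$, we have:

$$\epsilon(\hat{h}) \leqslant \left(\min_{h \in H} \epsilon(h)\right) + 2\sqrt{\frac{1}{2m}\log\left(\frac{2k}{\delta}\right)}$$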

Remark: the VC dimension of H = { set of linear classifiers in 2 dimensions } is 3.

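Theorem (Vapnik) – Let H be given, with VC dimension $d$, and let $m$ be the number of training examples. With probability at least $1 - \delta$, we have:

$$\epsilon(\hat{h}) \leqslant \left(\min_{h \in H} \epsilon(h)\right) + O\left(\sqrt{\frac{d}{m}\log\left(\frac{m}{d}\right) + \frac{1}{m}\log\left(\frac{1}{\delta}\right)}\right)$$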

2.3 Dimension reduction

2.3.1 Principal component analysis

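Step 1: Normalize the data to have a mean of 0 and a standard deviation of 1:

$$x_j^{(i)} \leftarrow \frac{x_j^{(i)} - \mu_j}{\sigma_j} \quad \text{where} \quad \mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)} \quad \text{and} \quad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x_j^{(i)} - \mu_j\right)^2$$

Step 2: Compute the covariance matrix of the normalized data:

$$\Sigma = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} {x^{(i)}}^T$$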

2.3.2 Independent component analysis

3 Deep Learning

3.1 Neural Networks

3.2 Convolutional Neural Networks

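Convolutional layer requirement – With $W$ the input volume size, $F$ the filter size, $P$ the amount of zero padding and $S$ the stride, the number of neurons $N$ that fit in a given volume is:

$$N = \frac{W - F + 2P}{S} + 1$$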
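The convolutional output-size formula, N = (W − F + 2P)/S + 1, can be checked with a small helper. This is a sketch under the assumption that the stride must divide W − F + 2P exactly; many frameworks instead round down. The function name `conv_output_size` and the example layer sizes are illustrative.

```python
def conv_output_size(w, f, p, s):
    """Number of neurons N = (W - F + 2P) / S + 1 along one spatial dimension.

    w: input volume size, f: filter size, p: zero padding, s: stride.
    Raises ValueError when the filter does not tile the padded input exactly.
    """
    span = w - f + 2 * p
    if s <= 0 or span < 0 or span % s != 0:
        raise ValueError("filter, padding and stride do not fit the input")
    return span // s + 1


if __name__ == "__main__":
    # Hypothetical layer: 32-wide input, 5-wide filter, padding 2, stride 1.
    print(conv_output_size(32, 5, 2, 1))
```

With padding P = (F − 1)/2 and stride 1, the output size equals the input size, which is why that padding is a common choice.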

3.3 Recurrent Neural Networks

4 Machine Learning Tips and Tricks

4.1 Metrics

4.1.1 Classification

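Confusion matrix (rows: actual class, columns: predicted class):

                Predicted +    Predicted −
    Actual +        TP             FN
    Actual −        FP             TN

Accuracy: $\frac{TP + TN}{TP + TN + FP + FN}$

Precision: $\frac{TP}{TP + FP}$

Recall (sensitivity): $\frac{TP}{TP + FN}$

Specificity: $\frac{TN}{TN + FP}$

F1 score: $\frac{2\,TP}{2\,TP + FP + FN}$

True positive rate (TPR): $\frac{TP}{TP + FN}$

False positive rate (FPR): $\frac{FP}{TN + FP}$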
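The ratios above are all functions of the four confusion-matrix counts TP, FN, FP and TN. A minimal sketch that computes them together follows, assuming every denominator is nonzero; the function name `classification_metrics` and the example counts are hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes each denominator is nonzero (both classes present in the data,
    and at least one positive prediction).
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also sensitivity / true positive rate
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "fpr": fp / (tn + fp),           # false positive rate = 1 - specificity
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in classification_metrics(tp=40, fn=10, fp=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```

Recall and the true positive rate are the same quantity, and the false positive rate is the complement of specificity, so the ROC curve's two axes come directly from this dictionary.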
