ex6.m

You will be using support vector machines (SVMs) with various example 2D datasets.

  • Plot Data (in ex6data1.mat)

ex6_plotting_ex6data1.png

SVM with Linear Kernel

Try using different values of the C parameter with the SVM. Informally, the C parameter is a positive value that controls the penalty for misclassified training examples: a large C tells the SVM to try to classify all training examples correctly, at the risk of a more sensitive decision boundary.

  • Plot decision boundary (in ex6data1.mat)

ex6_plotting_decision_boundary_with_C_1.png

ex6_plotting_decision_boundary_with_C_100.png
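
The two boundaries above can be reproduced roughly as follows; a minimal sketch, assuming the exercise's svmTrain, linearKernel, and visualizeBoundaryLinear helpers with their usual defaults:

    % Train a linear-kernel SVM for each C and plot its decision boundary
    load('ex6data1.mat');    % provides X, y
    for C = [1 100]
      model = svmTrain(X, y, C, @linearKernel, 1e-3, 20);  % tol 1e-3, max 20 passes
      visualizeBoundaryLinear(X, y, model);
    end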

Train SVM with RBF Kernel

  • Plot Data (in ex6data2.mat)

ex6_plotting_ex6data2.png

C: 1, sigma: 0.1

  • Plot decision boundary (in ex6data2.mat)

ex6_plotting_decision_boundary_with_rbf_kernel.png
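
For reference, a sketch of the similarity the RBF (Gaussian) kernel computes between two examples, as in the exercise's gaussianKernel.m:

    function sim = gaussianKernel(x1, x2, sigma)
      % 1 when x1 == x2, falls toward 0 as the examples move apart;
      % sigma controls how fast the similarity decays
      sim = exp(-sum((x1 - x2) .^ 2) / (2 * sigma ^ 2));
    end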

Try Different SVM Parameters to Train SVM with RBF Kernel

Automatically choose optimal C and sigma based on a cross-validation set.

C list: [0.01 0.03 0.1 0.3 1 3 10 30]

sigma list: [0.01 0.03 0.1 0.3 1 3 10 30]

=> optimal C = 1 and sigma = 0.1
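
A minimal sketch of this grid search (dataset3Params.m in the exercise), assuming the svmTrain and svmPredict helpers and the cross-validation set Xval, yval:

    values = [0.01 0.03 0.1 0.3 1 3 10 30];
    best_err = Inf;
    for C = values
      for sigma = values
        model = svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma));
        err = mean(double(svmPredict(model, Xval) ~= yval));  % CV misclassification rate
        if err < best_err
          best_err = err; best_C = C; best_sigma = sigma;
        end
      end
    end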

  • Plot Data (in ex6data3.mat)

ex6_plotting_ex6data3.png

  • Plot decision boundary with optimal SVM parameters (in ex6data3.mat)

ex6_plotting_decision_boundary_with_optimal_svm_parameters.png

ex6_spam.m

You will be using support vector machines to build a spam classifier.

For the purpose of this exercise, you will only be using the body of the email (excluding the email headers).

  • Preprocess sample email (in emailSample1.txt, vocab.txt)

Convert each email into a vector of features.

Given the vocabulary list, each word in the preprocessed email is mapped to its index in that list, producing a list of word indices.

Lower-casing, Stripping HTML, Normalizing URLs, Normalizing Email Addresses, Normalizing Numbers, Normalizing Dollars, Word Stemming, Removal of non-words

vocabulary list: a list of 1899 words
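
A sketch of the vocabulary lookup inside processEmail.m, assuming the getVocabList helper and a hypothetical tokens cell array of preprocessed words:

    vocabList = getVocabList();                     % 1899-entry cell array from vocab.txt
    word_indices = [];
    for i = 1:numel(tokens)
      idx = find(strcmp(vocabList, tokens{i}), 1);  % position of the word in the vocabulary
      if ~isempty(idx)
        word_indices = [word_indices; idx];         % words not in the vocabulary are skipped
      end
    end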

  • Extract Features from Emails (in emailSample1.txt)

The feature x_i ∈ {0, 1} for an email corresponds to whether the i-th word in the dictionary occurs in the email: x_i = 1 if the i-th word is in the email, and x_i = 0 if it is not.
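
A minimal sketch of emailFeatures.m, which turns the word indices into that binary vector:

    n = 1899;                 % vocabulary size
    x = zeros(n, 1);
    x(word_indices) = 1;      % x(i) = 1 iff the i-th vocabulary word occurs in the email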

  • Train Linear SVM for Spam Classification (in spamTrain.mat, spamTest.mat)

Train an SVM to classify between spam (y = 1) and non-spam (y = 0) emails.

spamTrain.mat: 4000 training examples of spam and non-spam email

spamTest.mat: 1000 test examples
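
A sketch of training and evaluation, assuming the exercise's svmTrain/svmPredict helpers and its choice of C = 0.1 with a linear kernel:

    load('spamTrain.mat');                        % X, y: 4000 labeled emails
    model = svmTrain(X, y, 0.1, @linearKernel);
    load('spamTest.mat');                         % Xtest, ytest: 1000 held-out emails
    p = svmPredict(model, Xtest);
    fprintf('Test accuracy: %.2f%%\n', mean(double(p == ytest)) * 100);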

Troubleshooting:

  • Error when plotting the decision boundary of the SVM with RBF kernel

Solution:

rewrite visualizeBoundary.m line 21:

=> contour(X1, X2, vals, [1 1], 'LineColor', 'b');


ex5.m

Implement regularized linear regression and use it to study models with different bias-variance properties.

  • Plot Data (in ex5data1.mat)

ex5_plotting_data.png

  • Compute Regularized Linear Regression Cost

lambda: 1, theta: [1 ; 1]

  • Compute Regularized linear regression gradient

lambda: 1, theta: [1 ; 1]
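
A minimal sketch of linearRegCostFunction.m covering both the cost and the gradient; note that theta(1), the bias term, is not regularized:

    h = X * theta;                                  % linear hypothesis
    J = (1 / (2 * m)) * sum((h - y) .^ 2) ...
        + (lambda / (2 * m)) * sum(theta(2:end) .^ 2);
    grad = (1 / m) * (X' * (h - y));
    grad(2:end) = grad(2:end) + (lambda / m) * theta(2:end);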

  • Train linear regression and plot fit over the data

lambda: 0

ex5_trained_linear_regression.png

  • Compute training error and cross-validation error for linear regression

lambda: 0

training error: evaluate the training error on the first i training examples (i.e., X(1:i, :) and y(1:i))

cross-validation error: evaluate on the entire cross validation set (Xval and yval).
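
A sketch of the loop in learningCurve.m, assuming the exercise's trainLinearReg helper; the errors themselves are evaluated with lambda = 0 (pure squared error):

    for i = 1:m
      theta = trainLinearReg(X(1:i, :), y(1:i), lambda);
      error_train(i) = linearRegCostFunction(X(1:i, :), y(1:i), theta, 0);
      error_val(i)   = linearRegCostFunction(Xval, yval, theta, 0);
    end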

  • Plot learning curve for linear regression

Since the model is underfitting the data, we expect to see a learning curve with “high bias”.

ex5_learning_curve_for_linear_regression.png

  • Map X onto Polynomial Features and Normalize

X_poly(i, :) = [X(i) X(i).^2 X(i).^3 … X(i).^p]
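
A minimal sketch of polyFeatures.m:

    X_poly = zeros(numel(X), p);
    for j = 1:p
      X_poly(:, j) = X .^ j;   % j-th column holds the j-th power of X
    end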

  • Train Polynomial regression and plot fit over the data

ex5_trained_polynomial_regression.png

  • Compute training error and cross-validation error for polynomial regression

lambda: 0

training error: evaluate the training error on the first i training examples (i.e., X(1:i, :) and y(1:i))

cross-validation error: evaluate on the entire cross validation set (Xval and yval).

  • Plot learning curve for polynomial regression

Since the model is overfitting the data, we expect to see a learning curve with “high variance”.

ex5_learning_curve_for_polynomial_regression.png

  • Test various values of lambda and compute error
  • Plot validation curve

Use the validation curve to select the “best” lambda value.

The best value of lambda is around 3.

ex5_validation_curve_for_polynomial_regression.png
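
A sketch of the loop in validationCurve.m; the exact lambda grid and the X_poly/X_poly_val variable names are assumptions based on the exercise code:

    lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';
    for i = 1:numel(lambda_vec)
      theta = trainLinearReg(X_poly, y, lambda_vec(i));
      error_train(i) = linearRegCostFunction(X_poly, y, theta, 0);
      error_val(i)   = linearRegCostFunction(X_poly_val, yval, theta, 0);
    end
    % choose the lambda with the lowest cross-validation error (around 3 here)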

ex4.m

Implement the backpropagation algorithm for neural networks and apply it to the task of hand-written digit recognition.

  • Plot Data (in ex4data1.mat)

ex4_plotting_data.png

  • Feedforward Using Neural Network and Compute Cost at parameters (loaded from ex4weights.mat)
  • Cost function with regularization

lambda: 1

  • Randomly initialize weights

Symmetry breaking: initialize each weight to a small random value

Theta(j, i) = RAND_NUM * (2*INIT_EPSILON) - INIT_EPSILON

RAND_NUM: a random number between 0 and 1
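
A sketch of randInitializeWeights.m under these conventions; the exercise suggests INIT_EPSILON ≈ 0.12:

    INIT_EPSILON = 0.12;
    Theta1 = rand(hidden_layer_size, input_layer_size + 1) * 2 * INIT_EPSILON - INIT_EPSILON;
    Theta2 = rand(num_labels, hidden_layer_size + 1) * 2 * INIT_EPSILON - INIT_EPSILON;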

  • Complete backpropagation and check Neural Network Gradients

Generate some ‘random’ test data and compare the backpropagation gradients against numerical gradients.

input_layer_size: 3

hidden_layer_size: 5

num_labels: 3

m: 5
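
The check uses two-sided finite differences; a sketch of computeNumericalGradient.m, where J is a handle to the cost function and theta holds the unrolled parameters:

    e = 1e-4;
    numgrad = zeros(size(theta));
    perturb = zeros(size(theta));
    for i = 1:numel(theta)
      perturb(i) = e;
      numgrad(i) = (J(theta + perturb) - J(theta - perturb)) / (2 * e);
      perturb(i) = 0;
    end
    % the relative difference to the backpropagation gradient should be tiny (e.g. < 1e-9)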

  • Regularized Neural Networks

lambda: 3

  • Training Neural Network

lambda: 1

  • Visualizing Weights

Display the hidden units to see what features they are capturing in the data.

Display the rows of Theta1 as images.

ex4_visualizing_nn.png

ex3.m

Implement one-vs-all logistic regression and neural networks to recognize hand-written digits.

  • Plot Data (in ex3data1.mat)

ex3_plotting_data.png

  • Training One-vs-All Logistic Regression

hypothesis function: 1 ./ (1 + e.^(-X*theta))

K = 10 (0 to 9)

Iterations: 50

  • Predict for One-Vs-All
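
A sketch of predictOneVsAll.m, assuming all_theta stacks the K trained parameter vectors row-wise:

    h = sigmoid([ones(m, 1) X] * all_theta');  % m x K matrix of class probabilities
    [~, p] = max(h, [], 2);                    % predict the most confident class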

ex3_nn.m

Implement a neural network to recognize handwritten digits using the same training set as before.

Provided with a set of already-trained network parameters (Θ(1), Θ(2)) in ex3weights.mat.

  • Feedforward Propagation and Prediction

Loading Saved Neural Network Parameters in ex3weights.mat
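
A sketch of the feedforward pass in predict.m for this one-hidden-layer network:

    a1 = [ones(m, 1) X];                      % input layer plus bias unit
    a2 = [ones(m, 1) sigmoid(a1 * Theta1')];  % hidden layer plus bias unit
    a3 = sigmoid(a2 * Theta2');               % output layer: one unit per digit class
    [~, p] = max(a3, [], 2);                  % predicted label = most activated output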

ex2.m

Implement logistic regression and apply it to two different datasets (ex2data1.txt, ex2data2.txt).

  • Plot Data (in ex2data1.txt)

ex2_plotting_data.png

  • Compute Cost and Gradient

hypothesis function: 1 ./ (1 + e.^(-X*theta))
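
A minimal sketch of the unregularized cost and gradient (costFunction.m) for this hypothesis:

    h = 1 ./ (1 + exp(-X * theta));                        % sigmoid hypothesis
    J = (1 / m) * (-y' * log(h) - (1 - y)' * log(1 - h));  % cross-entropy cost
    grad = (1 / m) * (X' * (h - y));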

  • Learning parameters using fminunc

initial theta: zeros, iterations: 400
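
A sketch of the fminunc call matching the settings above:

    options = optimset('GradObj', 'on', 'MaxIter', 400);
    initial_theta = zeros(size(X, 2), 1);
    [theta, cost] = fminunc(@(t) costFunction(t, X, y), initial_theta, options);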

  • Plot Decision Boundary

ex2_plotting_decisionBoundary.png

  • Predict and Accuracies

Use the logistic regression model to predict the probability that a student with a score of 45 on exam 1 and a score of 85 on exam 2 will be admitted.

ex2_reg.m

The axes are the two test scores, and the positive (y = 1, accepted) and negative (y = 0, rejected) examples are shown with different markers.

  • Plot Data (in ex2data2.txt)

ex2_reg_plotting_data.png

  • Add Polynomial Features and Compute Cost

original X: [X1 X2]

mapFeatured X: [X1 X2 (X1.^2) (X2.^2) (X1.*X2) (X1.*X2.^2) …]

hypothesis function: 1 ./ (1 + e.^(-X*theta))
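
A sketch of mapFeature.m, which generates all polynomial terms of X1 and X2 up to degree 6, plus a leading column of ones:

    degree = 6;
    out = ones(size(X1(:, 1)));              % bias column
    for i = 1:degree
      for j = 0:i
        out(:, end + 1) = (X1 .^ (i - j)) .* (X2 .^ j);
      end
    end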

  • Plot Decision Boundary with lambda 0

ex2_reg_plotting_decisionBoundary_with_lambda_0.png

  • Plot Decision Boundary with lambda 1

ex2_reg_plotting_decisionBoundary_with_lambda_1.png

  • Plot Decision Boundary with lambda 100

ex2_reg_plotting_decisionBoundary_with_lambda_100.png