## ex6.m

In this exercise, you will use support vector machines (SVMs) on several example 2D datasets.

- Plot Data (in ex6data1.mat)

### SVM with Linear Kernel

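Try different values of the C parameter with SVMs. Informally, the C parameter is a positive value that controls the penalty for misclassified training examples: a large C tells the SVM to try to classify every training example correctly, while a small C tolerates some misclassified examples in exchange for a wider margin.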
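To make the role of C concrete, here is a minimal Python sketch of the standard soft-margin objective for a linear SVM. It is not the exercise's Octave code: the function name `linear_svm_objective`, the toy data, and the recoding of labels to -1/+1 are assumptions made for illustration.

```python
def linear_svm_objective(w, b, X, y, C):
    """Return C * sum(hinge losses) + 0.5 * ||w||^2 for labels y in {-1, +1}."""
    hinge_total = 0.0
    for x_i, y_i in zip(X, y):
        # Signed margin: positive and >= 1 means "correct with room to spare".
        margin = y_i * (sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        hinge_total += max(0.0, 1.0 - margin)
    regularizer = 0.5 * sum(w_j * w_j for w_j in w)
    return C * hinge_total + regularizer


# Toy data (hypothetical): the third point lies on the wrong side of the boundary.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, -1, -1]
w, b = [1.0, 0.0], 0.0

for C in (1, 100):
    print(C, linear_svm_objective(w, b, X, y, C))
# prints "1 2.0" and then "100 150.5"
```

The single misclassified point contributes a hinge loss of 1.5, so raising C from 1 to 100 makes that one point dominate the objective. This is why a large C pushes the boundary toward fitting every training example.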

- Plot decision boundary (in ex6data1.mat)

### Train SVM with RBF Kernel

- Plot Data (in ex6data2.mat)

C: 1, sigma: 0.1

- Plot decision boundary (in ex6data2.mat)
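The RBF (Gaussian) kernel measures the similarity between two examples as exp(-||x1 - x2||^2 / (2 * sigma^2)). A minimal Python sketch of that formula follows; the function name `gaussian_kernel` and the sample vectors are illustrative assumptions, and the exercise itself implements the kernel in Octave.

```python
import math


def gaussian_kernel(x1, x2, sigma):
    """Gaussian (RBF) similarity between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Squared distance is 1 + 4 + 4 = 9, so the value is exp(-9 / 8), about 0.3247.
print(gaussian_kernel([1, 2, 1], [0, 4, -1], sigma=2.0))
```

A smaller sigma makes the similarity fall off faster with distance, which is why sigma = 0.1 can produce a tightly curved boundary on ex6data2.mat.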

### Try different SVM Parameters to train SVM with RBF Kernel

Automatically choose optimal C and sigma based on a cross-validation set.

C list: [0.01 0.03 0.1 0.3 1 3 10 30]

sigma list: [0.01 0.03 0.1 0.3 1 3 10 30]

=> optimal C = 1 and sigma = 0.1

- Plot Data (in ex6data3.mat)

- Plot decision boundary with optimal SVM parameters (in ex6data3.mat)
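The parameter selection above is a grid search: train with every (C, sigma) pair and keep the pair with the lowest cross-validation error. A minimal Python sketch of that loop follows. The names `grid_search` and `toy_cv_error` are hypothetical, and `toy_cv_error` is a made-up stand-in so the sketch runs on its own; it does not reproduce the real cross-validation errors on ex6data3.mat.

```python
import math
from itertools import product

CANDIDATES = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30]


def grid_search(cv_error, c_values=CANDIDATES, sigma_values=CANDIDATES):
    """Return (best_C, best_sigma, best_error) over all 8 x 8 = 64 pairs."""
    best = None
    for C, sigma in product(c_values, sigma_values):
        err = cv_error(C, sigma)  # train on the training set, score on the CV set
        if best is None or err < best[0]:
            best = (err, C, sigma)
    return best[1], best[2], best[0]


def toy_cv_error(C, sigma):
    # Hypothetical stand-in for "train an SVM and measure CV error".
    # It is rigged to be smallest at C = 1, sigma = 0.1 purely for demonstration.
    return abs(math.log10(C)) + abs(math.log10(sigma) + 1.0)


best_C, best_sigma, best_err = grid_search(toy_cv_error)
print(best_C, best_sigma)
```

In a real run, `cv_error` would train the SVM on the training examples with the given C and sigma and return the fraction of cross-validation examples it misclassifies.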

## ex6_spam.m

In this exercise, you will use support vector machines to build a spam classifier.

For the purpose of this exercise, you will only be using the body of the email (excluding the email headers).

- Preprocess sample email (in emailSample1.txt, vocab.txt)

Convert each email into a vector of features.

Given the vocabulary list, we can now map each word in the preprocessed emails into a list of word indices that contains the index of the word in the vocabulary list.

Preprocessing steps: lower-casing, stripping HTML, normalizing URLs, normalizing email addresses, normalizing numbers, normalizing dollar signs, word stemming, and removal of non-words.
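Most of these normalization steps can be expressed as regular-expression substitutions. The Python sketch below illustrates them under stated assumptions: the function name `preprocess_email` and the placeholder tokens `httpaddr`, `emailaddr`, `number`, and `dollar` are choices made for this example, and word stemming is left out because it needs a separate stemmer.

```python
import re


def preprocess_email(body):
    """Normalize an email body and return its list of lowercase word tokens."""
    text = body.lower()                                         # lower-casing
    text = re.sub(r"<[^<>]+>", " ", text)                       # strip HTML tags
    text = re.sub(r"(http|https)://[^\s]*", "httpaddr", text)   # normalize URLs
    text = re.sub(r"[^\s]+@[^\s]+", "emailaddr", text)          # normalize email addresses
    text = re.sub(r"[$]+", "dollar ", text)                     # normalize dollar signs
    text = re.sub(r"[0-9]+", "number", text)                    # normalize numbers
    tokens = re.split(r"[^a-z]+", text)                         # drop non-letter characters
    return [t for t in tokens if t]


sample = "Visit <b>http://Example.com</b> or mail Bob@example.com, only $100!"
print(preprocess_email(sample))
# ['visit', 'httpaddr', 'or', 'mail', 'emailaddr', 'only', 'dollar', 'number']
```

Replacing every URL, address, and number with a single token lets the classifier learn from the presence of a URL or a price rather than from its specific value.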

vocabulary list: a list of 1899 words
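Mapping words to indices amounts to a dictionary lookup that skips any word missing from the vocabulary. A minimal Python sketch follows; the function names and the four-word vocabulary are hypothetical, and indices are 1-based to match Octave's convention.

```python
def build_index(vocab_words):
    """Map each vocabulary word to its 1-based position in the list."""
    return {word: i for i, word in enumerate(vocab_words, start=1)}


def words_to_indices(tokens, word_to_index):
    """Return the vocabulary index of each token, skipping unknown words."""
    return [word_to_index[t] for t in tokens if t in word_to_index]


vocab = ["anyon", "know", "number", "spam"]   # hypothetical tiny vocabulary
tokens = ["anyon", "know", "how", "number"]   # "how" is not in the vocabulary

print(words_to_indices(tokens, build_index(vocab)))
# [1, 2, 3]
```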

- Extract Features from Emails (in emailSample1.txt)

The feature xi ∈ {0, 1} for an email indicates whether the i-th word in the dictionary occurs in the email. That is, xi = 1 if the i-th word is in the email and xi = 0 if it is not.
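This binary feature vector can be built directly from the list of word indices. A minimal Python sketch follows, assuming 1-based word indices and a hypothetical function name `email_features`; the default length of 1899 matches the vocabulary size stated above.

```python
def email_features(word_indices, vocab_size=1899):
    """Return a 0/1 list where position i-1 is 1 if word i occurs in the email."""
    x = [0] * vocab_size
    for idx in word_indices:
        x[idx - 1] = 1  # 1-based vocabulary index to 0-based list position
    return x


# A repeated index still yields 1: the feature records presence, not counts.
print(email_features([1, 3, 3], vocab_size=5))
# [1, 0, 1, 0, 0]
```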

- Train Linear SVM for Spam Classification (in spamTrain.mat, spamTest.mat)

Train an SVM to classify emails as spam (y = 1) or non-spam (y = 0).

spamTrain.mat: 4000 training examples of spam and non-spam email

spamTest.mat: 1000 test examples

## Troubleshooting

- Error when plotting the decision boundary of the SVM with RBF kernel

Solution:

rewrite visualizeBoundary.m line 21:

=> contour(X1, X2, vals, [1 1], 'LineColor', 'b');