CSCI 744 Pattern Recognition and Machine Learning

Reading assignment: Textbook Chapter 2

Part I Single Feature

1. Create a MATLAB script that performs each of the steps required for this exercise.

2. Load "partOneData.mat" into the MATLAB environment (included on Blackboard as part of the assignment).

3. Create a histogram for each of the class distributions {classOne, classTwo}. Plot both histograms on the same figure (use 100 bins). The x and y axes should be labeled appropriately, and the figure should have a title and a legend.
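As a rough illustration of step 3, the sketch below overlays two 100-bin histograms on one figure. It uses synthetic data in place of the contents of "partOneData.mat", and it assumes that classOne and classTwo are vectors of scalar feature values; the axis labels and title are placeholder wording, not required text.

```matlab
% Synthetic stand-ins for the variables loaded from partOneData.mat
% (assumption: each class is a vector of scalar feature values).
rng(0);                                  % fixed seed so the figure is repeatable
classOne = 2 + 1.0 * randn(1000, 1);
classTwo = 5 + 1.5 * randn(800, 1);

figure;
histogram(classOne, 100);                % 100 bins, as the step requires
hold on;                                 % keep the first histogram on the axes
histogram(classTwo, 100);
hold off;

xlabel('Feature value');                 % placeholder label text
ylabel('Number of samples');
title('Histograms of classOne and classTwo');
legend('classOne', 'classTwo');
```

Each call to histogram picks its own bin edges here, so the two classes may end up with different bin widths; passing a shared vector of edges is one way to make the bars directly comparable.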

4. Report the prior probability for classOne. (Hint: number of classOne samples divided by the total number of samples.)

5. Report the prior probability for classTwo. (Hint: same as above, but for classTwo.)
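The hint for steps 4 and 5 amounts to a ratio of sample counts. A minimal sketch, using placeholder vectors of 600 and 400 samples in place of the real data (the names nOne, nTwo, priorOne and priorTwo are illustrative, not part of the assignment):

```matlab
% Placeholder data: only the number of samples matters for the priors.
classOne = zeros(600, 1);
classTwo = zeros(400, 1);

nOne = numel(classOne);                  % number of classOne samples
nTwo = numel(classTwo);                  % number of classTwo samples

priorOne = nOne / (nOne + nTwo);         % P(classOne)
priorTwo = nTwo / (nOne + nTwo);         % P(classTwo)

fprintf('P(classOne) = %.3f, P(classTwo) = %.3f\n', priorOne, priorTwo);
```

If each sample is stored as a row of a matrix rather than as one element of a vector, size(classOne, 1) would be the sample count instead of numel(classOne).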

6. Create 5 random partitions of the data, splitting each of the classes into 60% training and 40% testing. For each of the 5 partitions:

a. Using only the training data, find the maximum likelihood estimator for the following parameters:

i. ClassOne: μ, σ

ii. ClassTwo: μ, σ

b. Classify each of the test samples using a Bayesian classifier (you must create a function that will do this). Report the prediction accuracy for each class.
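The partitioning and estimation in step 6 could be sketched as follows. This is only one possible approach, run on synthetic data, and it assumes each class is a vector of scalar feature values; the variable names other than classOne and classTwo are illustrative. The classification function itself is left out.

```matlab
rng(1);                                   % fixed seed so the run is repeatable
classOne = 2 + 1.0 * randn(1000, 1);      % synthetic stand-ins for the real data
classTwo = 5 + 1.5 * randn(800, 1);

numPartitions = 5;
trainFraction = 0.6;
muOne = zeros(numPartitions, 1);  sigmaOne = zeros(numPartitions, 1);
muTwo = zeros(numPartitions, 1);  sigmaTwo = zeros(numPartitions, 1);

for p = 1:numPartitions
    % Shuffle each class separately so the 60/40 split holds within each class.
    idxOne = randperm(numel(classOne));
    idxTwo = randperm(numel(classTwo));
    nTrainOne = round(trainFraction * numel(classOne));
    nTrainTwo = round(trainFraction * numel(classTwo));

    trainOne = classOne(idxOne(1:nTrainOne));
    testOne  = classOne(idxOne(nTrainOne+1:end));
    trainTwo = classTwo(idxTwo(1:nTrainTwo));
    testTwo  = classTwo(idxTwo(nTrainTwo+1:end));

    % Maximum likelihood estimates from the training data only.
    % std(x, 1) normalizes by N (the ML estimate) rather than by N-1.
    muOne(p) = mean(trainOne);   sigmaOne(p) = std(trainOne, 1);
    muTwo(p) = mean(trainTwo);   sigmaTwo(p) = std(trainTwo, 1);

    % testOne and testTwo are held out for the classification step.
end
```

Note that MATLAB's default std(x) divides by N-1, which gives the unbiased estimate rather than the maximum likelihood one; the difference is small for large samples but it is a different estimator.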

7. Report the mean and standard deviation of the prediction accuracy across the 5 partitions from step 6.

Hint: You will need to create a function that, given the mean and standard deviation of a Gaussian distribution, evaluates the probability density of that distribution at a value "x".
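For reference, the quantity the hint describes is the standard univariate Gaussian density. The symbols \omega_i (class i) and P(\omega_i) (its prior) are notation introduced here for illustration and may differ from the textbook's.

```latex
p(x \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}}
    \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)

% A Bayesian classifier with two classes then assigns x to classOne when
p(x \mid \mu_1, \sigma_1)\, P(\omega_1) > p(x \mid \mu_2, \sigma_2)\, P(\omega_2)
% and to classTwo otherwise.
```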

MATLAB template below:

function probability = computeGaussianDensity(mean, stdDev, x)

% Your code here

end

Part II Multivariate

1. Create a MATLAB script that performs each of the steps required for this exercise.

2. Load "partTwoData.mat" into the MATLAB environment (included on Blackboard as part of the assignment).

3. Report the prior probability for classOne.

4. Report the prior probability for classTwo.

5. Create 5 random partitions of the data, splitting each of the classes into 60% training and 40% testing.

a. Repeat the following process for each of the 5 random partitions:

i. Using only the training data, find the maximum likelihood estimator for the following parameters:

1. ClassOne: μ, covariance matrix

2. ClassTwo: μ, covariance matrix

ii. Classify each of the test samples using a Bayesian classifier (you must create a function that will do this). Report the prediction accuracy for each class.
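For the multivariate case, the estimation in step 5 might look like the sketch below. It runs on synthetic two-dimensional data for a single class and assumes that samples are stored one per row, which may not match the layout in "partTwoData.mat"; names other than classOne are illustrative, and the classifier is again left out.

```matlab
rng(2);                                          % fixed seed so the run is repeatable
% Synthetic stand-in: 500 samples (rows) with 2 features (columns).
classOne = randn(500, 2) * [1 0.5; 0 1] + repmat([1 3], 500, 1);

numPartitions = 5;
nSamples = size(classOne, 1);
nTrain = round(0.6 * nSamples);

for p = 1:numPartitions
    idx = randperm(nSamples);                    % random ordering of the rows
    trainOne = classOne(idx(1:nTrain), :);       % 60% of the rows for training
    testOne  = classOne(idx(nTrain+1:end), :);   % remaining 40% held out

    % Maximum likelihood estimates from the training rows only.
    muOne = mean(trainOne, 1);                   % 1-by-d mean vector
    sigmaOne = cov(trainOne, 1);                 % d-by-d covariance, normalized by N
end
```

As with std, the default cov(X) normalizes by N-1; the second argument of 1 selects normalization by N, which is the maximum likelihood estimate.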

6. Report the mean and standard deviation of the prediction accuracy across the 5 partitions from step 5.

Hint: You will need to create a function that, given the mean and covariance matrix of a multivariate Gaussian distribution, evaluates the probability density of that distribution at a value "x".
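For reference, the standard multivariate Gaussian density for a d-dimensional sample is given below. Here \boldsymbol{\mu} is the mean vector, \Sigma the covariance matrix, and |\Sigma| its determinant; the notation is chosen for illustration and may differ from the textbook's.

```latex
p(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) =
    \frac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}}
    \exp\!\left( -\tfrac{1}{2}
        (\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \Sigma^{-1}
        (\mathbf{x} - \boldsymbol{\mu}) \right)
```

With d = 1 and \Sigma = \sigma^2 this reduces to the single-feature density used in Part I.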

MATLAB template below:

function probability = computeGaussianDensityMultivariate(mean, covarianceMatrix, x)

% Your code here

end