- May 24, 2020

COMP7703 – Machine Learning
Take Home Exam (Worth 10% of total marks for the course)
Marcus Gallagher
May 2020

1 Instructions

Complete all questions. Submit your answers in Blackboard using the test item in the Assessment Section. You can use whatever resources you wish to help you complete the questions (e.g. Matlab/Python, books, web), but you are strongly encouraged to complete the exam individually rather than discussing or working with other students. It is up to you to decide how much time you spend on this task, but the intention is that it will not take more than approximately 4 hours to complete. Give your answers correct to 5 significant digits.

2 Questions

Questions 1-5 use the sonarmini.csv dataset (available on the course Blackboard site).

1. Calculate the absolute difference between the biased (Maximum Likelihood Estimator) and unbiased estimators of the sample variance of the second column of the dataset.

2. Using the Manhattan distance, which data points are the 3 nearest neighbours of the 5th point (i.e. row 5) in the dataset? (Use only the first two columns for this question; the third column is a class label.)

3. Let $x_i$ be the $i$th feature/column in the dataset. Consider the classification rule: "Output 1 if $x_2 < 0.05$, else output 0". How many points in the dataset are classified correctly with this rule?

4. Consider fitting a 2D histogram to columns 1 and 2 of the dataset:
   • three bins of equal width in each dimension (i.e. 9 bins)
   • bins span the range $[0, 0.15]$ for $x_1$ and $[0, 0.24]$ for $x_2$.
   Which bin has the greatest height?

5. Consider constructing a dendrogram of rows 6-10 of this data, using single-link clustering. Ignore the third column and use Euclidean distance. On the first iteration, points/rows/groups 6 and 9 are merged into a group. On the second iteration, which two groups are merged? Give the row numbers of all the points in the newly merged group as your answer.

6. Consider a Gaussian mixture model

   $$p(x \mid \theta) = \frac{1}{3}\,\mathcal{N}\!\left( (1, 2)', \begin{pmatrix} 2 & 0 \\ 0 & 0.5 \end{pmatrix} \right) + \frac{1}{3}\,\mathcal{N}\!\left( (-3, -5)', \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right) + \frac{1}{3}\,\mathcal{N}\!\left( (0, 1)', \begin{pmatrix} 0.6 & 0.5 \\ 0.5 & 1.6 \end{pmatrix} \right)$$

   Calculate the probability density of the model at the point $x' = (1.2, 1.2)$.

7. Find the Mahalanobis distance [1] from the point $x' = (1, 1)$ to the Gaussian

   $$\mathcal{N}\!\left( (0, 1)', \begin{pmatrix} 0.6 & 0.5 \\ 0.5 & 1.6 \end{pmatrix} \right)$$

8. The volume of the hyperellipsoid corresponding to a Mahalanobis distance $r$ is given by

   $$V = V_d \, |\Sigma|^{1/2} \, r^d$$

   where $V_d$ is the volume of a $d$-dimensional unit hypersphere:

   $$V_d = \begin{cases} \pi^{d/2} / (d/2)!, & d \text{ even} \\ 2^d \, \pi^{(d-1)/2} \left( \frac{d-1}{2} \right)! \, / \, d!, & d \text{ odd.} \end{cases}$$

   Calculate this volume with $r = 6$ and the Gaussian

   $$\mathcal{N}\!\left( (0, 0, 0, 0)', \begin{pmatrix} 0.6 & 0.5 & 0.5 & 0.5 \\ 0.5 & 1.6 & 0.5 & 0.5 \\ 0.5 & 0.5 & 0.6 & 0.5 \\ 0.5 & 0.5 & 0.5 & 1.6 \end{pmatrix} \right)$$

9. It is often useful to be able to compute things "on-line" (i.e. recursively) with respect to a dataset (meaning, e.g., that the entire dataset does not need to be stored in memory). The sample mean of a scalar variable $x$ can be calculated in this way using:

   $$\hat{\mu}_{n+1} = \hat{\mu}_n + \frac{1}{n+1} \left( x_{n+1} - \hat{\mu}_n \right)$$

   If we have previously observed 10 data points and the sample mean is 5.0, what would the next observation need to be to change the sample mean to equal 6.0?

10. Consider carrying out gradient descent on the function $f(x) = x^4$. Starting from an initial position $x = 2$ with a step size $\eta = 0.1$, calculate the value of the search position after three updates.

[1] The Mahalanobis distance is defined as $D = \sqrt{(x - \mu)' S^{-1} (x - \mu)}$ (see, for example, Wikipedia or the [DHS] book). Unfortunately, however, the Alpaydin book gives the impression that it is defined without the square root. Wikipedia calls this the "generalized squared interpoint distance" and provides a citation.
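The recursive mean update, the gradient descent iteration, and the Mahalanobis distance from the footnote can each be written in a few lines. The following is a minimal sketch in plain Python, not a reference solution: the function names (`online_mean_update`, `required_observation`, `gradient_descent`, `mahalanobis_2d`) are hypothetical, and the numbers in the demonstration are toy values chosen for illustration rather than the values given in the questions.

```python
import math


def online_mean_update(mean, n, x_new):
    """Recursive sample mean: mu_{n+1} = mu_n + (x_{n+1} - mu_n) / (n + 1)."""
    return mean + (x_new - mean) / (n + 1)


def required_observation(mean, n, target):
    """Invert the update above: the x_{n+1} that moves the mean to `target`."""
    return mean + (n + 1) * (target - mean)


def gradient_descent(grad, x0, eta, steps):
    """Apply x <- x - eta * grad(x) for a fixed number of updates."""
    x = x0
    for _ in range(steps):
        x = x - eta * grad(x)
    return x


def mahalanobis_2d(x, mu, cov):
    """D = sqrt((x - mu)' S^{-1} (x - mu)) for a 2x2 covariance S.

    Uses the closed-form 2x2 inverse; assumes S is non-singular.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form with S^{-1} = (1 / det) * [[d, -b], [-c, a]]
    quad = (dx0 * (d * dx0 - b * dx1) + dx1 * (-c * dx0 + a * dx1)) / det
    return math.sqrt(quad)


if __name__ == "__main__":
    # Toy values only (assumed for illustration, not taken from the exam).
    print(online_mean_update(2.0, 3, 6.0))       # mean of 3 points is 2.0, then observe 6.0
    print(required_observation(2.0, 3, 3.0))     # observation needed to reach a mean of 3.0
    print(gradient_descent(lambda x: 4 * x**3, 1.0, 0.125, 3))  # f(x) = x^4
    print(mahalanobis_2d((2.0, 0.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 1.0))))
```

Note that `mahalanobis_2d` includes the square root, following the footnote's definition; dropping the `math.sqrt` call gives the squared form that the footnote attributes to the Alpaydin book.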