
Tutoring Case: EE4211/TEE4211

May 15, 2020

EE4211/TEE4211 Mehul Motani – Homework 2

2-1. Decision Trees
I mentioned in class that the number of possible decision trees is very large. How many decision trees exist with n binary attributes? Here is a way to think about the problem.
• Suppose you have one binary attribute. Then there are 2^1 = 2 possible values for the attribute, and each of those values can be mapped to 2 outputs, so there are 2^2 = 4 decision trees.
• Suppose you have two binary attributes. Then there are 2^2 = 4 possible values for the pair of attributes, and each value can be mapped to 2 outputs, so there are 2^4 = 16 decision trees.
• Now suppose you have n attributes. How many decision trees are there?

2-2. Decision Trees
Consider the following training set with features A, B, C, and target/label Y.
a. What is the entropy of the output Y?
b. Using the information gain criterion, what is the first node you would split at? Explain clearly why.
c. Using the information gain criterion, complete the learning of the decision tree for this dataset. Draw the decision tree and comment on whether the tree is unique.

2-3. Support Vector Machines (SVM)
Consider building an SVM for the following two-class training data:
Positive class: (-1, 3) (0, 2) (0, 1) (0, 0)
Negative class: (1, 5) (1, 6) (3, 3)
a. Plot the training points and, by inspection, draw a linear classifier that separates the data with maximum margin.
b. The linear SVM is parameterized by h(x) = w^T x + b. What are the parameters w and b for this problem?
c. Suppose you observe an additional set of points, all from the positive class. What is the linear SVM (in terms of w and b) now?
More positive points: (-2, 0) (-2, 1) (-2, 3) (-1, 0) (-1, 1) (0, 0)

2-4. Performance Metrics
Suppose you are given the same test set and two binary classifiers.
Is it possible that Classifier 1 has higher accuracy than Classifier 2, but Classifier 2 has both higher precision and higher recall than Classifier 1? If this is not possible, clearly explain why. If it is possible, provide an example 2 × 2 table for each classifier, each giving counts of true positives, false positives, true negatives, and false negatives.

2-5. Programming
In this problem, we will look at the Breast Cancer Wisconsin (Diagnostic) Data Set, available at:
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29
Compute the performance of the machine learning algorithms listed below on this dataset for predicting whether the diagnosis is malignant or benign. Use a random split of 70% of the data for training and 30% for testing. Repeat this process 20 times and compute the average performance for both the training and testing stages.

Algorithms:
• DT1: Decision Tree with Information Gain
• DT2: Same as DT1 with limited tree size; vary the number of levels to beat DT1 if you can.
• SVM1: SVM with linear kernel
• SVM2: SVM with RBF kernel
• SVM3: Same as SVM2 but with regularization (soft margin); choose C to beat SVM1 and SVM2 if you can.

Report your results in a table of the following form, with training and testing values for each metric:

           Accuracy        Precision       Recall
           Train   Test    Train   Test    Train   Test
    DT1
    DT2
    SVM1
    SVM2
    SVM3
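For Problem 2-1, the counting argument in the bullets can be sanity-checked programmatically. This is a sketch that counts distinct Boolean functions (truth tables) over n binary attributes, which is what the problem's enumeration is tracking — it does not count distinct tree shapes:

```python
# Count the Boolean functions representable over n binary attributes:
# there are 2**n input combinations, and each combination can be
# mapped independently to one of 2 outputs.
def num_boolean_functions(n: int) -> int:
    return 2 ** (2 ** n)

for n in range(1, 4):
    print(n, num_boolean_functions(n))  # 1 -> 4, 2 -> 16, 3 -> 256
```

The n = 1 and n = 2 values match the 4 and 16 given in the problem statement.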
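For Problem 2-2, the entropy and information-gain computations can be sketched as below. The original training table did not survive in this copy, so the rows and labels here are a made-up stand-in, not the assignment's data:

```python
# Entropy and information gain for a categorical attribute.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) = -sum_y p(y) * log2 p(y)."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """IG(Y; attr) = H(Y) - sum_v (n_v / n) * H(Y | attr = v)."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical rows with attributes A, B, C at indices 0, 1, 2.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ["+", "+", "-", "-"]
print(entropy(labels))                    # 1.0
print(information_gain(rows, labels, 0))  # 1.0: A alone determines Y
print(information_gain(rows, labels, 1))  # 0.0: B is uninformative
```

The split with the largest information gain (here A) is the one a greedy information-gain learner would choose first.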
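For Problem 2-3, once a candidate hyperplane has been drawn by inspection, it can be checked numerically: all points should land on the correct side with functional margin at least 1. The w and b below are a candidate derived by inspection for illustration, not an official answer:

```python
import math

pos = [(-1, 3), (0, 2), (0, 1), (0, 0)]
neg = [(1, 5), (1, 6), (3, 3)]

w = (-0.5, -0.5)  # candidate weight vector (assumed, by inspection)
b = 2.0           # candidate bias (assumed, by inspection)

def h(x):
    """Linear SVM decision function h(x) = w^T x + b."""
    return w[0] * x[0] + w[1] * x[1] + b

# In canonical form, positives satisfy h(x) >= +1, negatives h(x) <= -1.
assert all(h(p) >= 1 for p in pos)
assert all(h(q) <= -1 for q in neg)

# Geometric margin of a canonical SVM is 1 / ||w||.
print(1 / math.hypot(w[0], w[1]))  # sqrt(2) ~ 1.414
```

For part (c), the same check can be re-run with the extra positive points appended to `pos` to see whether the candidate hyperplane still separates the enlarged set at full margin.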
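For Problem 2-4, it helps to have the three metrics written out from the four counts of a 2 × 2 confusion table. The counts below are a made-up illustration, not an answer to the question:

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-table counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),  # of predicted positives, fraction correct
        "recall": tp / (tp + fn),     # of actual positives, fraction found
    }

print(metrics(tp=40, fp=10, tn=40, fn=10))
# {'accuracy': 0.8, 'precision': 0.8, 'recall': 0.8}
```

Note that accuracy depends on the true negatives while precision and recall do not, which is the lever to reason about when comparing the two classifiers.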
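The experiment in Problem 2-5 can be sketched as follows, assuming scikit-learn is available. Its bundled `load_breast_cancer` dataset is the same WDBC (Diagnostic) data as the UCI link; the `max_depth` and `C` values here are example choices, not tuned answers:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

def make_models():
    # Fresh, unfitted models for each repetition.
    return {
        "DT1": DecisionTreeClassifier(criterion="entropy"),  # information gain
        "DT2": DecisionTreeClassifier(criterion="entropy", max_depth=3),
        "SVM1": SVC(kernel="linear"),
        "SVM2": SVC(kernel="rbf"),
        "SVM3": SVC(kernel="rbf", C=10.0),  # example soft-margin C
    }

n_repeats = 20
results = {}
for name in make_models():
    train_scores, test_scores = [], []
    for seed in range(n_repeats):
        # 70/30 random split, different seed each repetition.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed)
        model = make_models()[name]
        model.fit(X_tr, y_tr)
        for scores, Xs, ys in ((train_scores, X_tr, y_tr),
                               (test_scores, X_te, y_te)):
            pred = model.predict(Xs)
            scores.append((accuracy_score(ys, pred),
                           precision_score(ys, pred),
                           recall_score(ys, pred)))
    results[name] = (np.mean(train_scores, axis=0),
                     np.mean(test_scores, axis=0))

print(f"{'':6s}{'acc/prec/rec (train)':>28s}{'acc/prec/rec (test)':>28s}")
for name, (tr, te) in results.items():
    print(f"{name:6s}{np.round(tr, 3)!s:>28s}{np.round(te, 3)!s:>28s}")
```

The averaged train/test rows printed at the end map directly onto the table requested in the problem statement.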
