

January 8, 2021

ECMM445 Learning from Data (MSc students only)
Project: Data Mining and Machine Learning

Submission Deadline: 12:00pm Wednesday 25th November (Week 10), BART submission.
Revised deadline: 12:00pm Monday 11th January 2021, BART submission.
Weight: 20%

Project Objective:
• Independent learning
• Expand on what was learned
• Find and read papers / tutorials relevant to your project
• Practice how to address a machine learning problem
• Practice how to report and discuss experimental results

Your task:
• Identify a methodology/algorithm that was not discussed in class. You will have the opportunity to discuss your ideas and ask questions during the online synchronous weekly session.
• Define your project scope by selecting a suitable dataset.
• Implement the experiments in Python.
• Perform a comparison with a method discussed during the lectures.
• Write a technical report.

Possible directions:
• Techniques we haven't seen
  o Random Forests and ensemble learning
  o AdaBoost
  o Deep Learning
  o Long short-term memory RNN
• More complex domains/problems
  o Time series analyses
  o Healthcare data and applications
  o Weather data
  o Image classification
  o Recommender Systems
  o Financial data/application

Project Deliverables
• Python notebook
  o Your code in one notebook, clearly indicating the candidate id
  o Data needed to replicate results / assess your work (or a link)
• Technical report
  o One PDF file, clearly indicating the candidate id
  o Should contain the following sections:
    1. Intro: Contextualization and Motivation
    2. Literature review
    3. Formal description of methods
    4. Discussion about experiments
    5. Conclusions

Please submit your BART assignment here:

It is vital that you check the time and date of your deadline on BART prior to your submission. Assignments that need to be submitted through BART must be submitted using the link above. BART will then guide you through the steps you need to complete for your submission. Failure to do so will result in your mark being capped for a late submission. [Full details are on the ELE page]

Marking Scheme
• Code: 30%
  o Correctness
  o Comments (markdown and code)
  o Clarity
• Technical report: 70%
  o Convincing motivation and methodological choices
  o Quality of contextualization (e.g., good references and related discussion)
  o Depth of experimental analysis

Software:
• You are expected to use Python 3.
• Suitable machine learning Python packages, e.g. sklearn, tensorflow, keras, etc.

Dataset Examples
Some of the datasets are complex and you should discuss the dataset selection with your lecturer.
• UCI
• Kaggle
• Healthcare Intensive Care Units (ICU) data and challenges:
  o MIMIC
  o Physionet
• Awesome public datasets:
  o Time series
• Social media datasets:
  o SemEval annual datasets/challenges
  o SNAP
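The comparison step in the task list (a technique not covered in class against a method from the lectures) can be sketched as below. This is only an illustrative outline, not part of the brief: the dataset (scikit-learn's bundled breast cancer data), the choice of logistic regression as the "lecture" baseline, and all hyperparameters are assumptions made for the example, and your own project should substitute its own dataset and methods.

```python
# Minimal sketch: compare a "new" method against a baseline with
# the same cross-validation folds. All choices here are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example dataset; replace with the dataset chosen for the project.
X, y = load_breast_cancer(return_X_y=True)

# Fixed, stratified folds so both models are scored on identical splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    # Assumed baseline standing in for "a method discussed during the lectures".
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # One of the "possible directions" named in the brief.
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: accuracy {scores.mean():.3f} +/- {scores.std():.3f}")
```

Scaling is placed inside the pipeline so that it is fitted on the training folds only, which avoids leaking information from the held-out fold into the baseline. Reporting the spread across folds alongside the mean gives the report's experimental discussion something more than a single number to analyse.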


Author admin
