Tutoring Case: CSCI 4144

Posted May 15, 2020

CSCI 4144 – Data Mining and Data Warehousing
Course Project: In-depth Understanding of an OLAP or Data Mining Algorithm

– TA: Serikzhan Kazi ([email protected]), Miheer Kulkarni ([email protected])
– Tutorial/Lab: 11:35am – 12:55pm, Wednesdays; Room: Goldberg 127
– Additional TA Help Hours at CS Learning Center:
  o Mondays (2pm-4pm): Zhenbang Wang ([email protected])
  o Wednesdays (2pm-4pm): Hui Huang ([email protected])
  o Fridays (3:35pm-5pm): Lauchlan Toal ([email protected])

1. Overview

In this project, you need to select a research paper that includes an OLAP or data mining algorithm, implement the algorithm, and discuss the performance of the algorithm. The major objective of this project is to learn how to find a useful OLAP/data mining algorithm and have an in-depth understanding of it.

2. Detailed Requirements

1) Group Size: You are allowed to work in a group with up to 3 members. If you prefer to complete an individual project, it is also acceptable.

2) Deliverables: In this project, you need to complete the following deliverables (Note that only one member of each group needs to submit these deliverables on behalf of the group):
   a) Project Proposal: Due 11:55pm, Mar. 1 (Sunday)
   b) Project Presentation Slides: The slides are due 11:55pm, Apr. 3 (Friday).
      a. The presentations will be held during the lecture/tutorial on Apr. 1 and 2. The detailed schedule will be announced after the project groups are formed.
   c) Project Code: Due 11:55pm Apr. 17 (Friday)
   d) Project Report: Due 11:55pm Apr. 17 (Friday)

3) Paper Selection: You can select a paper using one of the following two approaches:
   a) IEEE Xplore and ACM Digital Library are two widely-used online libraries. You can search for research papers using varied keywords, such as OLAP, classification, clustering, web classification, and web clustering.
      The links to these two libraries can be found here: http://dal.ca.libguides.com/c.php?g=257110&p=1716818
   b) Browse the webpages of various conferences (such as KDD 2019) and journals (such as IEEE Transactions on Knowledge and Data Engineering) to go through the latest research papers, then select one that interests you. Once a paper is selected, you can use the resources available at the Dalhousie library (such as IEEE Xplore and ACM Digital Library) to find the full paper. A pdf file that summarizes the major publication venues on data mining and data warehousing can be found in brightspace.
   c) You are NOT allowed to select a paper that corresponds to any of the algorithms that are covered in this course. However, a revised/improved version of an algorithm covered in my slides is acceptable. Here is a list of the algorithms that I plan to cover in this course:
      a. OLAP: Multi-Way Array Aggregation, BUC, High-Dimensional OLAP
      b. Frequent Itemset Mining: Apriori Algorithm
      c. Classification: ID3, Naive Bayesian Classification, Classification Based on IF-THEN Rules
      d. Clustering: k-means, AGNES, DIANA, Dendrogram, DBSCAN

4) Project Proposal: The length of the project proposal is approximately 1 page. It should include the following parts:
   a) Tentative title of the project: Note that the title could be revised later.
   b) Tentative selected research paper: It should include the list of authors, paper title, publication venue (i.e. where the paper is published), and publication time. If you find a more interesting research paper later, you can change the selected research paper (although this is not preferred because it will leave you less time to implement the algorithm and collect experimental results).
   c) Problem description: It describes the problem to be solved with the algorithm presented in the selected paper.
   d) Project timeline: It should include the major milestones of the project and your own deadlines for them.
   e) Group Members: A list of the members in the project group

5) Project Presentation: The details about the presentation can be found below:
   a) The presentation should include the following sections: problem to be solved, how the algorithm works, implementation details (programming language, data set, etc.), and experimental results (note that preliminary results are acceptable).
   b) The presentation should be roughly 4 minutes long. All group members are encouraged to participate in the presentation.
   c) The presentation will be held during the lecture/tutorial on Apr. 1 and 2. The detailed schedule will be announced after the project groups are formed.
   d) The presentation slides need to be submitted via brightspace on Apr. 3 (Friday).

6) Project Code: In this project, you need to implement the selected algorithm in order to collect the experimental results about the performance of the algorithm.
   a) Required Programming Language: You can use Java, C, C++, or Python as the programming language because bluenose supports these languages (note that you need to have a CSID to access bluenose via SSH). In this project, you can use all kinds of libraries/APIs as long as they are available on bluenose.
   b) Online Data Sets: To study the performance of the selected OLAP or data mining algorithm, you might need to use some data sets. You can either create the data sets yourself or utilize the online data sources such as those on the following webpage:
      a. http://www.kdnuggets.com/datasets/index.html

7) Readme File: You need to complete a readme file named “Readme.txt”, which includes the instructions that the TA could use to compile and execute your program to generate the experimental results.

8) Project Report:
   a) Report components:
      a. A cover page that includes the title of your project, your group ID (note that each group will be assigned a group ID), and the names and banner IDs of the group members.
      b. Introduction: A brief description of the problem to be tackled.
      c. Algorithm: Please describe how the selected algorithm works.
      d. Data Preparation and Algorithm Implementation:
         i. Please describe how the algorithm is implemented. You could include the information about the selected programming language, the structure of the program, the major classes, etc.
         ii. If a data set is involved, please describe how the data is obtained and processed. If not, you do not need to include the description about data preparation.
      e. Experimental results: You need to demonstrate the performance of the algorithm by including the detailed experimental results. When possible, please compare your results with the results included in the selected paper.
      f. Conclusion: Please include your comments about the algorithm.
      g. List of References
   b) Report format:
      a. Line spacing: single space
      b. Font size: 11 or smaller
      c. Column per page: single-column
      d. Report length: Your report should be at most 6 pages long. Namely, with the cover page, your report should be at most 7 pages long.

9) Submission: You need to submit the following deliverables via brightspace. Please note that each group only needs to submit 1 Project Proposal, 1 copy of Project Presentation Slides, 1 Project Code file, and 1 Project Report. Namely, only one member of a group needs to submit the project-related files on behalf of the whole group. In addition, please pay attention to the following requirements:
   a) Project Proposal: You need to convert your project proposal into a pdf file. The name of the pdf file should be “CSCI4144-ProjectProposal-YourFirstname-YourLastName.pdf”. For example, my proposal file should be named “CSCI4144-ProjectProposal-Qiang-Ye.pdf”. The project proposal should be submitted via brightspace.
   b) Project Presentation Slides: You need to convert your project presentation slides into a pdf file. The name of the pdf file should be “CSCI4144-ProjectPresentation-Group-YourGroupID.pdf”.
      For example, if the group ID is 12, then the pdf file should be named “CSCI4144-ProjectPresentation-Group-12.pdf”. The project presentation slides should be submitted via brightspace.
   c) Project Code:
      a. You should place “Readme.txt” in the directory where your program file is located.
      b. Your program file and “Readme.txt” should be compressed into a zip file named “CSCI4144-ProjectCode-Group-YourGroupID.zip”. For example, if the group ID is 12, then the zip file should be called “CSCI4144-ProjectCode-Group-12.zip”. Finally, you need to submit your zip file via brightspace.
   d) Project Report: You need to convert your project report into a pdf file. The name of the pdf file should be “CSCI4144-ProjectReport-Group-YourGroupID.pdf”. For example, if the group ID is 12, then the pdf file should be named “CSCI4144-ProjectReport-Group-12.pdf”. The project report should be submitted via brightspace.

3. Grading Criteria

The marker will use your submitted zip file (except for the presentation) to evaluate your assignment. The grade of the project presentation will be based on the in-class presentation and the submitted presentation slides.

1) Project Proposal (10 Points):
   a) Tentative title of the project (1 Point)
   b) Tentative selected research paper (1 Point)
   c) Problem description (6 Points)
   d) Project timeline (1 Point)
   e) Group Members (1 Point)

2) Project Presentation (7 Points):
   a) Content (3 Points):
      a. Background
      b. Algorithm Description
      c. Implementation Description
      d. Experimental Results
   b) Clarity (3 Points):
      a. Logical and systematic development of ideas
      b. Precise use of formal language
   c) Timing (1 Point):
      a. Time is well distributed over varied components
      b. Presentation is completed on time

3) Project Code (10 Points):
   a) Does “Readme.txt” include enough information so that the TA can easily compile and execute the program on bluenose? (1 Point)
   b) Can the submitted program be executed on bluenose to generate the experimental results? (6 Points)
   c) Overall design of the program (3 Points)

4) Project Report (20 Points):
   a) Cover page (1 Point)
   b) Introduction (2 Points)
   c) Algorithm (4 Points)
   d) Data Preparation and Algorithm Implementation (4 Points)
   e) Experimental results (4 Points)
   f) Conclusion (1 Point)
   g) List of References (1 Point)
   h) Overall report quality (3 Points)

4. Academic Integrity

At Dalhousie University, we respect the values of academic integrity: honesty, trust, fairness, responsibility and respect. As a student, adherence to the values of academic integrity and related policies is a requirement of being part of the academic community at Dalhousie University.

1) What does academic integrity mean?
   Academic integrity means being honest in the fulfillment of your academic responsibilities, thus establishing mutual trust. Fairness is essential to the interactions of the academic community and is achieved through respect for the opinions and ideas of others. Violations of intellectual honesty are offensive to the entire academic community, not just to the individual faculty member and students in whose class an offence occurs (See Intellectual Honesty section of University Calendar).

2) How can you achieve academic integrity?
   – Make sure you understand Dalhousie’s policies on academic integrity.
   – Give appropriate credit to the sources used in your assignment such as written or oral work, computer codes/programs, artistic or architectural works, scientific projects, performances, web page designs, graphical representations, diagrams, videos, and images. Use RefWorks to keep track of your research and edit and format bibliographies in the citation style required by the instructor. (See http://www.library.dal.ca/How/RefWorks)
   – Do not download the work of another from the Internet and submit it as your own.
   – Do not submit work that has been completed through collaboration or previously submitted for another assignment without permission from your instructor.
   – Do not write an examination or test for someone else.
   – Do not falsify data or lab results.
   These examples should be considered only as a guide and not an exhaustive list.

3) What will happen if an allegation of an academic offence is made against you?
   I am required to report a suspected offence. The full process is outlined in the Discipline flow chart, which can be found at: http://academicintegrity.dal.ca/Files/AcademicDisciplineProcess.pdf and includes the following:
   a. Each Faculty has an Academic Integrity Officer (AIO) who receives allegations from instructors.
   b. The AIO decides whether to proceed with the allegation and you will be notified of the process.
   c. If the case proceeds, you will receive an INC (incomplete) grade until the matter is resolved.
   d. If you are found guilty of an academic offence, a penalty will be assigned ranging from a warning to a suspension or expulsion from the University and can include a notation on your transcript, failure of the assignment or failure of the course. All penalties are academic in nature.

4) Where can you turn for help?
   – If you are ever unsure about ANYTHING, contact me.
   – The Academic Integrity website (http://academicintegrity.dal.ca) has links to policies, definitions, online tutorials, and tips on citing and paraphrasing.
   – The Writing Center provides assistance with proofreading, writing styles, and citations.
   – Dalhousie Libraries have workshops, online tutorials, citation guides, Assignment Calculator, RefWorks, etc.
   – The Dalhousie Student Advocacy Service assists students with academic appeals and student discipline procedures.
   – The Senate Office provides links to a list of Academic Integrity Officers, discipline flow chart, and Senate Discipline Committee.
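
Supplementary note on the submission rules (Section 2, item 9): the file-naming and packaging requirements can be checked mechanically before uploading. The following is a minimal Python sketch and is not part of the official handout. The function names (deliverable_name, proposal_name, package_code) and the directory arguments are hypothetical and chosen only for illustration; the only details taken from the handout are the file-name patterns and the rule that “Readme.txt” must sit in the same directory as the program files.

```python
import zipfile
from pathlib import Path


def deliverable_name(kind: str, group_id: int, ext: str) -> str:
    """Build a group deliverable name, e.g. CSCI4144-ProjectCode-Group-12.zip.

    `kind` is assumed to be one of "Presentation", "Code", or "Report".
    """
    return f"CSCI4144-Project{kind}-Group-{group_id}.{ext}"


def proposal_name(first_name: str, last_name: str) -> str:
    """Build the proposal name, which uses the submitter's name, not a group ID."""
    return f"CSCI4144-ProjectProposal-{first_name}-{last_name}.pdf"


def package_code(src_dir, group_id: int, out_dir) -> Path:
    """Zip the program files together with Readme.txt (hypothetical helper)."""
    src_dir = Path(src_dir)
    # The handout requires Readme.txt to be in the program directory.
    if not (src_dir / "Readme.txt").is_file():
        raise FileNotFoundError("Readme.txt must sit next to the program files")
    archive = Path(out_dir) / deliverable_name("Code", group_id, "zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src_dir.rglob("*")):
            # Skip directories and the archive itself if out_dir is inside src_dir.
            if path.is_file() and path.resolve() != archive.resolve():
                zf.write(path, path.relative_to(src_dir))
    return archive
```

For example, package_code("my_project", 12, ".") would write “CSCI4144-ProjectCode-Group-12.zip” to the current directory, assuming a directory named my_project that contains “Readme.txt”. Keeping the name construction in one function avoids typos across the three group deliverables.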

Author admin
