
Tutoring case: BISM7217 Assignment 1

May 15, 2020

BISM7217 – 2020 S1 – Assignment 1

Advanced Business Data Analytics
BISM7217 ASSIGNMENT 1

Summary

• Type: Project report
• Learning Objectives Assessed: 1, 2, 3, 4, 5
• Due Date: 20 Apr 2020 11 AM
• Deliverable: A written report submitted via TurnItIn and a RapidMiner process
• Weight: 30%

This is an individual assignment. The aim is to provide experience in the steps involved in creating, evaluating, and improving classification models, and finally in presenting and interpreting the model in a business report. You are strongly encouraged to commence this assignment by the end of the third week of the semester, and you should progress thoughtfully through the steps. Hasty decisions made early in the design process may result in much more work later. Feel free to discuss concepts and ideas with peers, but remember that your submission must be your own work. Be careful not to allow anyone to copy your work.

Specification

Direct marketing is a form of advertising that allows organizations to communicate directly with customers through a variety of media, including phone calls and emails. As selecting the best set of clients, i.e., those that are more likely to subscribe to a product, is a complex task (Nobibon et al., 2011), various technologies should be employed to improve marketing by focusing on specific customers, thus allowing companies to build longer-lasting relationships aligned with their business strategies (Rust et al., 2010). Centralizing remote customer interactions in a contact center eases the operational management of campaigns, and communicating with customers by telephone is one way to conduct direct marketing activities (Moro et al., 2014). Marketing operationalized through a contact center is called telemarketing (Kotler et al., 2009). In the banking industry, deciding on the target customers for telemarketing is of crucial importance, under growing pressure to increase profits and reduce costs.
Banks are now pressured to increase capital requirements in various ways, including capturing more long term deposits (Moro et al., 2014). In this context, predictive modeling that uses previous data to predict the result of a telemarketing phone call to sell long term deposits is a valuable tool to support the client selection decisions of bank campaign managers. As an analyst in BOP, a Portuguese bank, you are going to propose a classification model that can predict the result of a phone call to sell long term deposits. Such a model is valuable in helping the managers of BOP bank prioritize and select the next customers to be contacted during bank marketing campaigns. Your model will help managers, including the Director of Market Intelligence, to analyze the probability of success. Consequently, the time and costs of such campaigns would be reduced, and by performing fewer and more effective phone calls, client stress and intrusiveness would be diminished.

Dataset

The data is related to direct marketing campaigns of BOP bank. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required to assess whether the product was of interest to a customer. The provided dataset contains 41188 records and 20 inputs, ordered by date (from May 2008 to November 2010). The classification goal is to predict whether the client will subscribe (yes/no) to a term deposit (subscription variable). There are 4 types of input variables and only 1 target/label/special variable:

A) Bank client data:
1. Age (type: numeric)
2. Job: type of job (type: categorical)
3. Marital: marital status (type: categorical)
4. Education (type: categorical)
5. Default: has credit in default? (type: categorical)
6. Housing: has a housing loan? (type: categorical)
7. Loan: has a personal loan? (type: categorical)

B) Related to the last contact of the current campaign:
1. Contact: contact communication type (type: categorical)
2. Month: last contact month of the year (type: categorical)
3. Day_of_week: last contact day of the week (type: categorical)
4. Duration: last contact duration, in seconds (type: numeric). Important note: The duration attribute profoundly affects the output target. For example, if the duration is ZERO, then y would most likely be “NO”. Yet the duration is not known before a call is performed, and once the call has ended, y is already known. Thus, you should discard the duration attribute if you intend to have a realistic predictive model.

C) Other attributes:
1. Campaign: number of contacts performed during this campaign and for this client (type: numeric) Note: This attribute includes the last contact.
2. Pdays: number of days that have passed since the client was last contacted in a previous campaign (type: numeric) Note: 999 means the client was not previously contacted.
3. Previous: number of contacts performed before this campaign and for this client (type: numeric)
4. Poutcome: outcome of the previous marketing campaign (type: categorical)

D) Social and economic context attributes:
1. Emp.var.rate: employment variation rate – quarterly indicator (type: numeric)
2. Cons.price.idx: consumer price index – monthly indicator (type: numeric)
3. Cons.conf.idx: consumer confidence index – monthly indicator (type: numeric)
4. Euribor3m: euribor 3 month rate – daily indicator (type: numeric)
5. Nr.employed: number of employees – quarterly indicator (type: numeric)

E) Output variable (desired target):
• Subscription: indicates if the client subscribed to a term deposit (type: binary)

Deliverables

Your report should include the following parts:
• Executive summary: Include the results that are most significant for your strategy development and recommendations, and justify them.
• Introduction or data exploration
• Model building
• Model evaluation

It is up to you to decide what proportion of your report goes to each part. You may include tables or charts of your analysis and models. At the end of your analysis, your RapidMiner process should be exported to your desktop or laptop in .rmp format and then uploaded along with your report. The consistency of your .rmp file with the results in your report will be checked. You do not need to provide screenshots of your RapidMiner process, as the marker can observe it from your .rmp file. Consider the following points when designing your process:
• You need to create only one .rmp file, with as many operators and outputs as are needed.
• You should not modify the “BISM7217_2020_S1_A1_Data.xlsx” file before importing it into RapidMiner.
• All of your analysis should be done after importing “BISM7217_2020_S1_A1_Data.xlsx” into RapidMiner, not in Excel or any other analytical tool.
• The process should start with loading the “BISM7217_2020_S1_A1_Data.xlsx” file from your desktop.

Formatting and professionalism

The project report is to be written to a professional standard. This requires a formal writing style – do not use dot points – and a professional tone. Given the report’s nature, you may choose to write it in the first person. The report must be consistent with the University’s policies on academic integrity, plagiarism, and consequences as noted below. The report should be typed (in Times Roman 12-point font or larger, single-spaced) and the Word Count should be 1500 words (+/- 10%) in total length. The Word Count excludes the title page, tables, footnotes and references (if required). The word limit must be observed or the assessment will be affected as noted in the Rubric. No appendices are to be provided.

Submission

To be done through Blackboard Assignment Submission and TurnItIn as indicated in Learn.UQ.
Acceptable submission formats are Microsoft Word and PDF for the report and .rmp for the process. The files MUST be named in the format BISM7217_StudentLastName_StudentID.pdf (or with a .docx or .doc extension). If your ID is 41724593 and your surname is Mory, the name of your file would be BISM7217_Mory_41724593.pdf. The written assignment file should not be zipped.

Plagiarism

It is understandable that students talk with each other regularly and discuss problems and potential solutions. However, it is expected that the submitted assignment is a unique document: all parts of the assignment are to be completed solely by the individual student. In cases where an assignment is perceived not to be a unique work, a loss of marks and other implications can result. For further information about academic integrity, plagiarism and consequences, please visit: http://ppl.app.uq.edu.au/content/3.60.04-student-integrity-and-misconduct.

Frequently Asked Questions

Question: How can I format my report?
Answer: The most common approach is to use 4 parts: 1) Executive summary, 2) Introduction, 3) Model building and 4) Model evaluation. You may wish to add other sections such as Conclusion (Optional) or References (Optional).

Question: What should I include in ES?
Answer: The Executive Summary (ES) is the essence of your work and should be very brief. Since your report is a maximum of 1500 words, it is better not to aim for more than 200 words for the ES, but again it is your choice, and it is essential to provide a quality and persuasive report.

Question: What can I discuss in the model building section?
Answer: You can discuss the following items in this section: How did you build the various models? Did you change the parameters, and why? Did you try to improve your models, and how? Could you improve your models?

Question: What should I include in the model evaluation section?
Answer: How did you evaluate your models? What metrics did you use, and why? Which model performed better, and why do you think so? Can you rely on your results, and why?

Question: What are the expectations when describing a Decision Tree (DT)? Do we need to talk about every branch?
Answer: The advantage of DTs is that they are very intuitive, and you can interpret them by elaborating on their branches. So, yes, but you do not need to elaborate on all of them. You can pick some of the more indicative ones and elaborate on them. You can use model improvement techniques, such as AdaBoost, Bagging, and Random Forests, along with decision trees, and elaborate on them too.

Question: Do I have to have all the DTs with different configurations and different model improvement methods in the .rmp file, to show how I tried different modeling? Or is it ok to have only the models that I am satisfied with and that I decide to use in my report?
Answer: You can submit only the process for the models that you discuss in your report. But it is worth mentioning in your report the additional work you have done.

Question: How can I export the figures generated in RapidMiner to my report?
Answer: You can use the Windows Snipping Tool.

Question: Which one is more important, accuracy or presentation? And how high an accuracy are we expected to reach?
Answer: Your approach, the steps undertaken, and their justification are more important than the final accuracy level. You need to show that you tried your best, but if the available data is not enough to achieve higher accuracy, it is not your fault. It is the maximum that we can learn from the available data.

Question: Can I upload as many .rmp files as I like?
Answer: We prefer only one process.

Question: Whenever I choose to export the process from the File menu and save it on my computer, I am unable to find it. What should I do?
Answer: Make sure to choose your desktop as the location when exporting the .rmp file.

Administrative Requirements

Consultation sessions

To ensure that an equal and sufficient amount of time is allocated to every student who attends consultation sessions regarding the practical aspects of BISM7217, the average consultation time (during busy consultation times) will be limited to 5 minutes per student.
The main aim of this restriction during busy periods is to ensure student equity and minimise waiting time. However, in circumstances where no other students are waiting, longer consultation times will be provided. Tutors will advise you of their consultation times during tutorials; these details are also available on the BISM7217 Blackboard site under “Contacts”.

Submission Date

11 AM 9th April 2020

For each calendar day (i.e. including Saturdays and Sundays) or part thereof after the submission deadline, a penalty of 5% of the total possible assignment marks will be deducted until the assignment is submitted.

Deadline extensions

An extension to the assignment deadline will only be considered for legitimate reasons and with supporting documentation (e.g. medical certificate). A request for an extension is assessed by the Assessment, Examinations & Misconducts Coordinator. You may discuss your situation with your course coordinator, but you still need to make a formal extension request using the form identified on the Electronic Course Profile for this course. Extensions will not be granted where the School is not satisfied that you took reasonable measures to avoid the circumstances that contributed to you not submitting by the due date. The following are not grounds for an extension:
• holiday arrangements (including overseas travel)
• misreading a due date
• social and leisure events
• moving house
• the pressure of work/competing deadlines
• computer issues

Please refer to the Electronic Course Profile for this course for more detail.

Marking rubric

Your report will be graded on its structure, rationale, arguments, use of academic support/sources, and overall presentation quality. This assignment is worth 20 marks. The marking rubric below is designed to reflect a marking schema of 100 points that are scaled back to 20 marks. Part marks are rounded up or down to the nearest half mark.
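The scaling and late-penalty arithmetic described above can be illustrated with a short sketch. This is not an official marking tool: the function names are hypothetical, and two details the text does not specify are assumed here, namely that exact ties round up to the next half mark and that the late penalty is subtracted after scaling and rounding.

```python
import math

TOTAL_MARKS = 20       # "This assignment is worth 20 marks"
RUBRIC_POINTS = 100    # rubric schema that is scaled back to 20 marks
PENALTY_PERCENT = 5    # 5% of total possible marks per calendar day or part thereof


def scale_to_marks(rubric_points: float) -> float:
    """Scale a 100-point rubric score to 20 marks, rounded to the nearest half mark.

    Assumption: exact ties (e.g. x.25) round up; the spec does not say.
    """
    raw = rubric_points * TOTAL_MARKS / RUBRIC_POINTS
    return math.floor(raw * 2 + 0.5) / 2


def late_penalty(hours_late: float) -> float:
    """Marks deducted for lateness; any part of a day counts as a full day."""
    days = math.ceil(hours_late / 24) if hours_late > 0 else 0
    return days * TOTAL_MARKS * PENALTY_PERCENT / 100


def final_mark(rubric_points: float, hours_late: float = 0.0) -> float:
    """Scaled mark minus late penalty, floored at zero (order of operations assumed)."""
    return max(0.0, scale_to_marks(rubric_points) - late_penalty(hours_late))


if __name__ == "__main__":
    # 79 rubric points, submitted 25 hours late (counts as 2 calendar days)
    print(final_mark(79, hours_late=25))
```

Under these assumptions each late day, or part of a day, costs one full mark, because 5% of 20 marks is 1 mark.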
Performance levels for each criterion: Very poor, Below Expectations, Meets Expectations, Good, Very Good, Outstanding.

Identify the problem space (20 marks)
• Very poor: Provides no evidence of initiative in planning and identifies no viable approaches for creating predictive models.
• Below Expectations: Provides little evidence of initiative in planning and identifies a few viable approaches for exploring data.
• Meets Expectations: Demonstrates satisfactory initiative in planning and identifies basic multiple approaches for exploring data.
• Good: Demonstrates good initiative in planning and identifies adequate multiple approaches for exploring data.
• Very Good: Demonstrates very good initiative in planning and identifies reasonable multiple approaches for exploring data.
• Outstanding: Demonstrates exemplary initiative in planning and identifies advanced multiple approaches for creating predictive models and exploring data.

Analysis (30 marks)
• Very poor: Summarizes data rather than performing predictive modelling; never identifies patterns.
• Below Expectations: Summarizes data with very little predictive modelling; rarely identifies patterns.
• Meets Expectations: Reveals some patterns. Reflects a proficient level of judgement in the set-up of the various developed models.
• Good: Reveals obvious patterns. Reflects a generally sound level of judgement in the set-up of the various developed models.
• Very Good: Synthesises patterns, differences or similarities. Reflects a mostly high level of judgement in the set-up of the various developed models.
• Outstanding: Reveals insightful patterns, differences or similarities. Reflects a consistently high level of judgement in the set-up of the various developed models.

Propose Solutions (25 marks)
• Very poor: Proposes no or inadequate solutions, indicating an obvious lack of comprehension of the developed models.
• Below Expectations: Proposes few solutions, indicating very little comprehension of the developed models.
• Meets Expectations: Proposes one or more acceptable creative solutions that indicate satisfactory comprehension of the problem.
• Good: Proposes one or more good creative solutions that indicate comprehension of the problem.
• Very Good: Proposes one or more very good creative solutions that indicate a deep comprehension of the problem.
• Outstanding: Proposes one or more exemplary creative solutions that indicate a deep comprehension of the problem.

Evaluate Potential Solutions (25 marks)
• Very poor: Evaluation of solutions is superficial and lacks any consideration of the developed models, with little or no logical judgement of the pros and cons of the developed model.
• Below Expectations: Evaluation of solutions is partial and lacks consideration of the developed models, with little or no logical judgement of the pros and cons of the developed model.
• Meets Expectations: Evaluation of solutions satisfactorily considers the developed models and logically judges the pros and cons of the developed model.
• Good: Evaluation of solutions fairly considers the developed models and logically judges the pros and cons of the developed model.
• Very Good: Evaluation of solutions is mostly thorough, includes reasonably thorough consideration of the developed models, and logically judges the pros and cons of the developed model.
• Outstanding: Evaluation of solutions contains consistently thorough and insightful explanation, includes quite thorough consideration of the developed models, and logically judges the pros and cons of the developed models.

Author admin