Skip to main content
留学咨询

辅导案例-RSM312H1

By May 15, 2020No Comments

Data and Information Management for Business Analytics RSM312H1 Assignment – Building a Database and Information Management Solution Using Access, Excel and Tableau Assignment Objective: The objective of this assignment is to design, develop and implement a complete Business Analytical Environment to support an e-commerce retailer. This analytical environment is required to deliver the following functional capabilities: 1. Accept raw input source data 2. Store and Manage the business data in a Relational Database Management System 3. Support an Analytical Sandbox Environment by creating a linked data extraction layer 4. Provide and Analytical Sandbox to test and develop various scenario, what-if analysis capabilities 5. Provide an interactive data visualization layer to support data visualizations and performance monitoring dashboards of key performance indicators. The purpose of creating this analytical environment is to support the business decision making processes for this retailer and your solution will be evaluated on 2 dimensions. The first dimension will be based on the feature functionality, technical design, and usability of your implemented solution. The second evaluation criteria will focus on your solutions ability to help support the managerial problem that the e-commerce retailer is facing. By applying the information management and business analysis skillsets that have been taught in your course, you will be going through a comprehensive database design, development and implementation process, followed by the development on a data analysis and visualization environment that will significantly expand the decision support capabilities of the e- commerce retailer. Please note that the use of any pre-developed templates (for either the database, or spreadsheet components) is strictly prohibited. All work is expected to be original development. Use of ‘freeware’ or ‘shareware’ for any of the core design components will not be accepted. Background: Olist is one of the largest e-commerce websites in the Brazilian marketplace. Olist connects small businesses from all over Brazil to retail channels under a single contract and without hassles. Merchants are able to sell their products through the Olist eStore and ship them directly to the customers using Olist’s logistics partners. Since Olist’s business model is largely built on being an efficient facilitator of a transaction, any disruptions in timely and efficient delivery of customer orders immediately attract managerial attention. Any customer order when the delivery is not made within the quoted delivery time is regarded as a “fail”. Moreover, there is evidence that long delivery times negatively affect sales, even when delivery occurs within the promised times. Thus, Olist’s supply management team is interested in identifying and prioritizing any issues impacting customer order fulfillment. Olist has initiated a project whose goal is to identify factors, that are within Olist’s influence or control, and that have the potential to improve outcomes in terms of customer order fulfillment process. Olist is very early in their investigation process; you and your team have been retained as analytical consultants for this project. Some background for the impetus of this project was provided by AE, one of Olist’s senior managers. “Customers who did not get their orders delivered by the promised date are often quite unhappy, and are not shy in letting us know about it. We, in turn, flag these incidents to the partners who were involved in this order. They know that we are unhappy and if such incidents keep happening, we will review our relationship with them. As a result, the proportion of orders with missed delivery dates is quite low, though there may be some further avenues for improvement that data analysis will uncover. A more subtle issue is that we may have over-emphasized the importance of meeting promised delivery dates. I believe that some of our partners have responded by padding the estimated delivery times to give themselves a large margin of safety, rather than optimizing their delivery procedures. As a result, the estimated delivery time shown to the customer is often quite long. I firmly believe that many customers will not place an order when they see estimated delivery times measured in weeks, particularly when our competitors often promise delivery in a few days. It is hard to measure this effect: there are many reasons why a customer who gets to the order placement page does not place the order: some are just “window shopping” – they may not have had any real intention of buying in the first place, some may get “sticker shock” when the see the total bill and simply abandon the order. However, I suspect there are many customers who are discouraged to buy when they see a long estimated delivery time. Another thing I wonder about is our policy to always ship items in the same order together. We do it to minimize the shipping costs, both for us and for the customer. However, if different sellers are involved with different fulfillment timelines for their items, the whole order may be delayed because of issues with just one item. Should we not, at the very least, offer our customers an opportunity to split the order and have different parts delivered separately?” Olist Order Fulfillment Process A customer wishing to make a purchase on Olist site will select the items they want and place them in a shopping cart. Note that an order may contain several identical items, or several different items, possibly carried by different sellers. Once the customer decides to purchase, Olist system determines which partner would handle this order and contacts them to get an estimated delivery time. Each partner follows their own procedure to determine the estimated delivery time. Generally, they will look at order complexity (number of items and sellers involved), the shipping distance, as well as seller quoted delivery times – the number days within which the seller commits to having a given item ready for pickup by the partner. All of this has to happen in real time, as the customer has to see the estimated delivery time before the order is finalized. Once the estimated delivery time is shown to the customer and they decide to place the order, they enter their payment detail and a unique order id and a “purchase timestamp” are generated. Next the customer payment is processed. Once an approval for payment is received, a “payment approved” timestamp is generated and the order is passed on to the logistics partner. Once the partner acknowledges receiving the order, another timestamp, “order delivered carrier” is generated. The partner then contacts all the sellers involved in fulfilling this order and asks them to have their respective items ready for pickup by their quoted “seller limit” date. The items are then picked up, packaged together and shipped to the customer. Once the customer accepts the delivery, the final “order delivered customer” timestamp is generated. While the vast majority of the orders are eventually delivered to customers, exceptions do occur due to item unavailability, order cancellations, or other circumstances. These are reflected in the “order status” field. The dataset has information on approximately 100k orders from 2016 to 2018 made at multiple marketplaces in Brazil. After a customer purchases the product from the Olist Store, a seller gets notified to fulfill that order. Note: 1. An order might have multiple items. 2. Each item might be fulfilled by a distinct seller. The following data schema with field descriptions has been provided to allow you to data analysis solution. Olist Customers This dataset has information about the customer and its location. Each order is assigned to a unique customer_id. This means that the same customer will get different ids for different orders. The purpose of ha
ving a customer_unique_id on the dataset is to allow you to identify customers that made repeated purchases at the store. Otherwise you would find that each order had a different customer associated with. customer_id key to the orders dataset. Each order has a unique customer_id. customer_unique_id unique identifier of a customer. customer_zip_code_prefix first five digits of customer zip code customer_city customer city name customer_state customer state Olist Geolocation This dataset has information about Brazilian zip codes and its lat/lng coordinates. You can use it to plot maps and find distances between sellers and customers. geolocation_zip_code_prefix first 5 digits of zip code customer_lat latitude of centroid of customer’s 5-digit zip code area, decimal notation customer_lng longitude of centroid of customer’s 5-digit zip code area, decimal notation geolocation_city city name geolocation_state state Olist Order Items Dataset This dataset includes data about the individual items purchased within each order. Each order may include multiple items, possibly supplied by different sellers. This dataset will include a separate entry for each unique product included in the order. order_id order unique identifier order_item_id sequential number identifying the number of items within the same order. product_id product unique identifier seller_id seller unique identifier shipping_limit_date shows the seller shipping limit date for handling the order over to the logistic partner price item price in Brazilian reals, R$. Currently 1R$ = 0.32CAD$ freight_value item freight value item (if an order has more than one item the freight value is split between items) Olist Order Payments Dataset This dataset includes information about the orders payment options. order_id order unique identifier payment_sequential customer can pay with more than one payment method. If they do so, a sequence will be created to accommodate all payments. payment_type method of payment chosen by the customer payment_installments number of installments chosen by the customer payment_value transaction value Olist Order Reviews Dataset This dataset includes information about the reviews made by the customer. After a customer purchases the product from the Olist Store, a seller gets notified to fulfill that order. Once the customer receives the product, or the estimated delivery date is due, the customer gets a satisfaction survey by email where he can give a note for the purchase experience and write down some comments. review_id unique review identifier order_id unique order identifier review_score note ranging from 1 to 5 given by the customer on a satisfaction survey review_comment_title comment title from the review left by the customer (in Portuguese) review_comment_message comment message from the review left by the customer (in Portuguese) review_creation_date shows the date in which the satisfaction survey was sent to the customer review_answer_timestamp shows satisfaction survey answer timestamp Olist Orders This is the core dataset. From each order you might find all other information. order_id unique identifier of the order customer_id key to the customer dataset. Each order has a unique customer_id. order_status reference to the order status (delivered, shipped, etc.) order_purchase_timestamp shows the purchase timestamp order_approved_at shows the payment approval timestamp order_delivered_carrier_date shows the order posting timestamp. When it was handled to the logistic partner order_delivered_customer_date shows the actual order delivery date to the customer order_estimated_delivery_date shows the estimated delivery date that was informed to customer at the purchase moment Olist Products This dataset includes information about each product sold by Olist. product_id unique product identifier product_category_name root category of product, in Portuguese product_name_length number of characters extracted from the product name product_description_length number of characters extracted from the product description product_photos_qty number of product photos on Olist website product_weight_g product weight measured in grams product_length_cm product length measured in centimeters product_height_cm product height measured in centimeters product_width_cm product width measured in centimeters Olist Sellers Dataset This dataset includes data about the sellers that fulfilled orders made at Olist. Use it to find the seller location and to identify which seller fulfilled each product. seller_id seller unique identifier seller_zip_code_prefix first 5 digits of seller zip code seller_city seller city name seller_state seller state NOTE: Category Name Translation Product_category_name_translation.csv contains two columns, the first column (product_category_name) contains the category name in Portuguese and the second column (product_category_name_english) contains the translated category name in English. NOTE: Calculating distances from coordinates The locations for customers and sellers are provided in decimal coordinates (https://en.wikipedia.org/wiki/Decimal_degrees), which you may want to convert to distances. While there are many ways to accomplish this simple task, the following formula will do the trick: Dist (in km) = ACOS(COS(RADIANS(90-Lat1)) *COS(RADIANS(90-Lat2)) +SIN(RADIANS(90-Lat1)) *SIN(RADIANS(90-Lat2)) *COS(RADIANS(Long1-Long2))) *6371 The Business Objectives As described in the case, Olist has initiated a project whose goal is to identify factors, that are within Olist’s influence or control, and that have the potential to improve outcomes in terms of customer order fulfillment process. Olist is very early in their investigation process; you and your team have been retained as analytical consultants for this project. Several potential issues related to the order fulfillment process and their impact on customer satisfaction and order placement are described in the case. Your task is to design and create both a data management and analysis environment to help support these issues on an ongoing basis. It is not necessary to try to address every aspect of the managerial problem in your project. However, its focus should have clear managerial relevance, and its findings should, at least potentially, provide valuable insights for Olist’s management. Defining a specific focus for your project is part of your task. Too narrow a focus may make findings less relevant as important parts of the managerial setting may be ignored. On the other hand too wide a focus may make the task intractable or infeasible. Deliverables: In this assignment you will be delivering an integrated solution across 3 application environments; 1. A relational database (using Access) 2. An analytical environment (using Excel and/or Python) 3. A data visualization environment with an interactive performance monitoring dashboard (using Tableau and/or Excel) Part 1. Designing and Building a Database You will be required to develop a physical database design in Access and import the raw csv datasets. Using the Database Relationship View, all tables must be interconnected by the appropriate 1 to many relationships. Standalone tables or tables with 1 to 1 relationships should not be incorporated into your design (please check with your TA if you feel that an exception is valid) Tables in your database must be given understandable names and descriptions. Variables (or columns) in the tables must be defined across appropriate data types including; text, date, number. Each table must have a primary key (or composite primary keys) defined. Once you’ve designed and populated your initial database you will need to evolve it into a working database that will demonstrate the following functional components that have been introduced in the Access portion of the course including; – At least 5 contrasting database administrative reports that can be used to summarize measu
rement data across a range of dimensions in order to assess data quality (i.e. Monthly Order Summary reports, Seller Reports, Product Performance Reports, Customer/Product Sales Reports, … etc.) to display your data in an attractive and professional manner. Reports should show a broad range of techniques and data views, and demonstrate different layout formats, and be contrasting from each other in both function and format. – At least 2 or more forms to facilitate data entry or updating for; orders, products, customers and suppliers. Forms must be show contrasting design layout approaches – Demonstrated capability to create detailed and summary SQL data extraction and data transformation queries that can be used to directly link (i.e. using a data connection) into your Excel Analytical environment as base input tables to support your analysis. You can also use these ETL queries to feed into your Visualization Environment in Tableau. Submit the completed database as an access database file attachment called XX_Group_Project_Database.accdb where XX is replaced by your group number (Please note: please ensure that original csv files are copied into your Access Database (i.e. not embedded as linked files) Part 2. Building an Analytical Environment in Excel Now that you have created a data management solution that is able to manage and retrieve Olist’s critical information, your next task is to focus on developing an analytical environment that takes data directly from the Access database. Your Excel solution must demonstrate the following capabilities: – Produce supporting analytical charts and graphs for senior management – Demonstrate the ability to pivot data in both a chart and graphical format – Through the use of formulas, take basic level data from the access database to create derived variables to help support advanced analysis – Within the appropriate context of your business problem, deploy and demonstrate the use of advanced functions within Excel including a minimum of one example that derives additional data elements from at least FIVE of each of the following categories: i. Mathematical or Statistical ii. Financial iii. Character and Text Manipulation iv. Database functions v. Logical functions vi. Lookup reference vii. Date Time manipulation – Create a series of analysis sheets that appropriately demonstrates A MINIMUM of THREE of the following analytical techniques covered in the course: Scenario Manager, Data Tables, Goal Seek, Solver, or a Statistical Analysis add in package. Capability must be demonstrated within the context of your business problem and data. – Your Excel analytical environment must contain an external data connection to your Access database to bring the underlying data from your database into the spreadsheet as a dynamic datalink. Submit the completed spreadsheet as an Excel file attachment called XX_Group_Project_Dashboard.xls where XX is replaced by your group number. You will also need to include your access database as XX_Group_Project_Database.accdb so that the Excel reference to the external data link can be found. Part 3. Building a Visual Analysis and Dashboard Environment in Tableau Now that you’ve developed your analytical environment and have been able to find insights and develop recommendations to help Olist better understand and manage it’s delivery and logistics challenges, the Olist team has requested your assistance in developing a data visualization environment that they can use to visualize how their business is performing and keep track of the key performance indicators through some form of a dashboard to help keep them on top of the logistical and delivery issues that they are faced with. Using Tableau Desktop, you have been tasked to develop a more Executive Level summarization of the key graphics, that includes a dashboard. In addition to the dashboard, the management team has requested a brief presentation/story using Tableau that points out the key trends and observations based on the analysis that was used to support your recommendations. Make sure that your story has a combination of; graphics, text and interactions that will allow the senior management team to get a better sense of the issues and concerns that need to be monitored on an ongoing basis. Submit your Tableau files as a published ‘read only’ Tableau packaged workbook (with your dataset embedded) that can be distributed through the Tableau Reader called XX_Assignment_3_Part_4_Dashboard.twbx where XX is replaced by your team number. (Please Note: DO NOT submit your original Tableau file (with the .twb extension) – you must do a File ….. Save As …… and select the Save as type: Tableau Packaged Workbook (*.twbx)) Evaluation Criteria Criteria Novice Proficient Database Environment Some issues with data loading, completeness of data. Unclear or incorrectly specified relationships between entities. ETL queries insufficient to provide a full view of the data for the analytical environment. Forms are simplistic and repetitive in structure and not fully considered for the intended purpose. Reports do not provide a clear or complete picture of the overall consistency/quality/integrity of the data environment. All tables fully populated with appropriately modelled entity relationships. ETL Queries are complementary to each other and illustrate a range of analytical views of the data from summary to detail across a range of dimensions. Appropriate use of Forms to support updating of key information into the database. Forms are easily interpretable and ensure that relational integrity is managed when adding or updating records. Range of contrasting reports to ensure that data integrity is in place such as checking key metric totals and breakdowns by common business dimensions. Analytical Environment Unclear line of inquiry in the analysis or difficult to follow due to lack of documentation or explanation, or poorly designed sheet layout. Most Excel derived or transformed data elements demonstrated, but more examples would be of further benefit to the analysis. Insufficient demonstration of analytical techniques in either number or quality of the inquiry. Well organized data linkages between underlying data sheets and the database. Methodical and easy to follow structure of analysis that is well documented and commented. Ease of readability and interpretability of analysis pages. Full and clear demonstration of various Excel derived data capabilities. Demonstration of minimum 3 required analytical techniques from; Scenario Manager, Data Tables, Goal Seek, Solver, or Statistical Analysis add in package. Data Visualization Environment A broader range of visualization techniques should be demonstrated. Multiple aspects of the 4Cs are missing, or have not been well addressed in the visualization. Study is required to interpret the data and how it applies to the thesis of the analysis. There is a clear message or story conveyed, but the action or conclusion is not definitive. May require interpretation. Chart types chosen are acceptable, but axes may be cluttered or have rotated text. Color choices communicate meaning but can be improved. Delivery provides some form of argument and is partially supported; minor details should be vetted and affirmed, or there may be a lingering lack of clarity and work may be required to review and confirm. The visualization suggests some possibilities, but does not lead to clarity of understanding and therefore action is not possible. The presentation and communication leaves concerns or lingering lack of clarity. Work required to review and confirm. Methodical structure of data linkages into underlying charts, graphs, dashboards and stories. Broad range of visualization techniques demonstrated. The 4Cs are well represented; the visualization is clear, clean, concise, and captivating. The visualization facilitates quick cognition and leading to a fact-based
conclusion or assertion. The visualization is targeted to the audience, the story is evident, and the conclusion or action required is clearly apparent. Chart types are suitable and best options for the analysis. All axes and text are treated appropriately. The application of color is correct and clearly conveys meaning. The visualization (or presentation) is delivered in a convincing way that demonstrates confidence, competency, and thoroughness. The visualization is targeted to the audience, the story is evident, and the conclusion or action required is clearly apparent. The visualization (or presentation) is delivered in a convincing way that demonstrates confidence, competency, and thoroughness.

admin

Author admin

More posts by admin