- July 5, 2020

Semester One Final Examinations, 2017 INFS7901 Database Principles Page 1 of 12 This exam paper must not be removed from the venue School of Information Technology and Electrical Engineering EXAMINATION Semester One Final Examinations, 2017 INFS7901 Database Principles This paper is for St Lucia Campus students. Examination Duration: 90 minutes Reading Time: 10 minutes Exam Conditions: This is a School Examination This is a Closed Book Examination – specified materials permitted During reading time – write only on the rough paper provided This examination paper will be released to the Library Materials Permitted In The Exam Venue: (No electronic aids are permitted e.g. laptops, phones) Calculators – Any calculator permitted – unrestricted One A4 sheet of handwritten or typed notes double sided is permitted Materials To Be Supplied To Students: 1 x 14 Page Answer Booklet 1 x Multiple Choice Answer Sheet Instructions To Students: Additional exam materials (eg. answer booklets, rough paper) will be provided upon request. Venue ____________________ Seat Number ________ Student Number |__|__|__|__|__|__|__|__| Family Name _____________________ First Name _____________________ For Examiner Use Only Question Mark Total ________ Semester One Final Examinations, 2017 INFS7901 Database Principles Page 2 of 12 Module 1 Question 1. (10 Marks) Functional Dependencies and Normal Forms Consider the relation instance on the right-hand side. Observe that B → A appears to hold with respect to the given instance. a. (4 marks) Check to see if all of the following dependencies hold with respect to the instance and explain why: A B C John 1 10 John 2 11 Jane 3 11 Jane 3 11 Jill 4 12 Jill 5 13 b. (2 marks) Determine the minimum number of tuples that can be added to the above instance to invalidate B → A. c. (4 marks) Assuming that B → A holds in the given schema. Decompose it, if necessary, so that all the resultant relations are in BCNF. A → B B → C Semester One Final Examinations, 2017 INFS7901 Database Principles Page 3 of 12 Question 2. (10 Marks) Relational Algebra Answer the following questions using queries in relational algebra. The following schema introduces an example, concerning high-school students applying for college. It involves the following relations: • Student (sID, sName, age, GPA, sizeHS) • College (cName, state, enrolment) • Apply (sID, cName, major, decision) The relation Student records the id, name, age, GPA, and size of the high school of the student. The relation college records the name of the college, the state in which the college is, and the current number of students that are attending the college. Finally, the Apply relation keeps track of application and the decision that was made by the college. a) (5 marks) Find the name of colleges that all of the students have applied to. b) (5 marks) Find the id of students have been accepted in the CS major of both Stanford and MIT. Semester One Final Examinations, 2017 INFS7901 Database Principles Page 4 of 12 Question 3. (10 Marks) SQL Answer the following questions using queries in SQL. For this question, we will be using the schema that was introduced in Question 2: • Student (sID, sName, age, GPA, sizeHS) • College (cName, state, enrolment) • Apply (sID, cName, major, decision) a) (4 marks) Find IDs of students that are not from the smallest high school. b) (6 marks) Find the id(s) and name(s) of the youngest student(s) that have been accepted in the CS major in each of the states. Semester One Final Examinations, 2017 INFS7901 Database Principles Page 5 of 12 Module 2 Question 4. (10 Marks) Asymptotic Analysis Assume that you are asked to make a recommendation between two algorithms (A and B) that accomplish exactly the same task. After measuring the run times for various inputs sizes, you determine that the average running time (for input of size n) of algorithm A is TA(n) = 3n2, and the average running time of algorithm B is TB(n) = 4n + 100. a) (2 marks) For n =10, which algorithm would run faster? Explain your answer b) (2 marks) For n = 1000, which algorithm would run faster? Explain your answer c) (2 marks) Determine the big-o notation of each of the algorithms. d) (4 marks) Which algorithm would you recommend? Refer to your answers from part a, b, and c in your recommendation. Semester One Final Examinations, 2017 INFS7901 Database Principles Page 6 of 12 Question 5. (6 Marks) Sorting In class, a variety of sorting algorithms were examined. For parts (a) to (c), choose the best algorithm from the ones we have examined. Justify your answer (in a sentence or two) referencing the running time of the algorithms. a) (2 marks) You are a data analyst and you are trying to put an input of a few million numbers into order. You are aware that the input is almost sorted and there may only be a few numbers that you need to move around. Which sorting algorithm would you use? b) (2 marks) You are sorting a few million numbers in a cloud environment in which read operations take O(1), constant time, and write operations take O(n), linear time. Which sorting algorithm would you use? c) (2 marks) You are part of an analysis team that is operating under very strict and sensitive timelines, so you want to be able to guarantee that the sorting algorithm that you use runs in O(n log n) in the absolute worst case. What sorting algorithm would you use? Question 6. (4 Marks) Sorting An array is supplied to the partition routine used by quicksort(). After the very first call to partition() the array contains the following numbers.: what number(s) in the array can be the pivot? Note the actual number(s) is wanted, not the position of the number(s) in the array. Semester One Final Examinations, 2017 INFS7901 Database Principles Page 7 of 12 Question 7. (10 Marks) Binary Search Trees This question makes reference to the following Binary Search Tree: a) (1 mark) What is the height of this tree? b) (1 mark) Which node is the root node? c) (1 mark) What is the depth of node 19? d. (1 mark) Which nodes are leaves? e. (2 marks) When building a tree using a sequence of insertions (i.e. the insertion technique from class) which one of the following is the appropriate sequence of insertions to build the tree? I. 23, 60, 17, 80, 40, 19, 10, 90, 21, 11, 6, 81 II. 23, 81, 90, 80, 40, 60, 21, 19, 11, 6, 10, 17 III. 23, 17, 10, 6, 11, 19, 21, 40, 60, 80, 90, 81 IV. 6, 11, 10, 17, 21, 19, 23, 40, 60, 81, 90, 80 V. 6, 10, 11, 17, 19, 21, 23, 40, 60, 80, 81, 90 VI. 6, 11, 10, 21, 19, 17, 40, 81, 90, 80, 60, 23 VII. 81, 6, 11, 21, 90, 10, 19, 40, 80, 17, 60, 23 VIII. 81, 90, 21, 11, 6, 80, 40, 19, 10, 60, 17, 23 B → C B → C B → C Semester One Final Examinations, 2017 INFS7901 Database Principles Page 8 of 12 f. (2 marks) Draw the tree resulting from deletion of node with value 23, assuming that we are using the node’s successor to replace it. g. (2 marks) Suppose that we have numbers between 1 and 1000 in a binary search tree, and we want to search for the number 363. Which of the following sequence(s) could not be the sequence of nodes examined? I. 23, 60, 17, 80, 40, 19, 10, 90, 21, 11, 6, 81 II. 23, 81, 90, 80, 40, 60, 21, 19, 11, 6, 10, 17 III. 23, 17, 10, 6, 11, 19, 21, 40, 60, 80, 90, 81 IV. 6, 11, 10, 17, 21, 19, 23, 40, 60, 81, 90, 80 V. 6, 10, 11, 17, 19, 21, 23, 40, 60, 80, 81, 90 VI. 6, 11, 10, 21, 19, 17, 40, 81, 90, 80, 60, 23 VII. 81, 6, 11, 21, 90, 10, 19, 40, 80, 17, 60, 23 VIII. 81, 90, 21, 11, 6, 80, 40, 19, 10, 60, 17, 23 Semester One Final Examinations, 2017 INFS7901 Database Principles Page 9 of 12 Question 8. (6 Marks) Hashing Consider a hash table of size 7 with hash function h(k) = k mod 7. Draw the contents of the table after inserting, in the given order, the following values into the table: 9, 16, 27, 62, and 2. a) (3 marks) When chaining is used to resolve collisions 0 1 2 3 4 5 6 b) (3 marks) When double hashing with secondary hash function h2(k) = 5−(k mod 5) is used to resolve collisions. 0 1 2 3 4 5 6 Question 9. (4 Marks) Hashing For good hashing performance, we need to have a good hash function. Can we create an ideal hash function that would work well in all problems independent of the dataset? If so explain how, and if not explain why? Semester One Final Examinations, 2017 INFS7901 Database Principles Page 10 of 12 Module 3 Question 10. (10Marks) Indexing For this question, we will be using the schema that was introduced in Question two: • Student (sID, sName, age, GPA, sizeHS) • College (cName, state, enrolment) • Apply (sID, cName, major, decision) Suppose you know that the following queries are the most common queries in the database and that all are roughly equivalent in frequency and importance: 1. Find name, age, GPA, and the high school size of student based on their SID 2. Find name of colleges in each state 3. Find students IDS with GPA higher than 3.6 4. Find SID of students that have applied to each college These queries occur much more frequently than updates, so you should build whatever indexes you need to speed up these queries. However, you should not build any un- necessary indexes, as updates will occur (and would be slowed down by unnecessary indexes). Given this information, decide which attributes should be indexed and whether each index should be a clustered index or an unclustered index. Assume that both B+ trees and hashed indexes are supported by the DBMS and that both single- and multiple- attribute index search keys are permitted. Semester One Final Examinations, 2017 INFS7901 Database Principles Page 11 of 12 Question 11. (10 Marks) Query Optimization For this question, we will be using the schema that was introduced in Question two: • Student (sID, sName, age, GPA, sizeHS) • College (cName, state, enrolment) • Apply (sID, cName, major, decision) Consider the following query Select s.sID from Student s, College c, Apply a Where s.sID = a.sID and c.cName=a.cName and GPA>3.5 and c.cName=’stanford’ and decision=’Y’ a) (5 marks) Draw the initial query tree for this query and explain briefly why this plan is considered inefficient. Semester One Final Examinations, 2017 INFS7901 Database Principles Page 12 of 12 b) (5 marks) Transform the initial query into an equivalent final query tree that is efficient to execute. END OF EXAMINATION