- May 15, 2020
CSCA08H Assignment 1

Deadline: Wednesday October 2, 2019 at 6:00 p.m. (moved from Tuesday October 1, 2019 at 4:00 p.m. to accommodate those of us attending the Global Climate Strike)

Late policy: There are penalties for submitting the assignment after the due date. These penalties depend on how many hours late your submission is. Please see the course website for more information.

Goals of this Assignment

- Use the Function Design Recipe to plan, implement, and test functions.
- Write function bodies using variables, numeric types, strings, and conditional statements. (You can do this whole assignment with only the concepts from Weeks 1, 2, and 3 of the course.)
- Learn to use Python 3, Wing 101, provided starter code, a checker module, and other tools.

Tweet Analyser

This assignment is based on the social network company Twitter. Twitter allows users to read and post tweets that are between 1 and 280 characters long, inclusive. In this assignment, you will be writing functions that (we imagine) are part of the programs that manage Twitter feeds.

Here are some example tweets:

Standing ovation as Setsuko Thurlow is awarded a Doctor of Laws degree, honoris causa, by the University of Toronto @UofT for her tireless nuclear disarmament work and contributions to the Treaty on the Prohibition of Nuclear Weapons with @nuclearban ICAN Congratulations to our class of 2019 #UofTGrad19

#UofT’s @ProbabilityProf @UofTStatSci created a mathematical model at the start of the playoffs to figure out the team’s odds of winning. He predicts their home-court advantage will give them an edge. http://bit.ly/ProbProf

Some terminology

We will use the following terms in this assignment.

tweet: A message posted on Twitter. For our assignment, a valid tweet is between 1 and MAX_TWEET_LENGTH characters long (inclusive). MAX_TWEET_LENGTH is a constant.

tweet word: A word in a tweet. For our assignment, a valid tweet word contains only alphanumeric characters and underscores.
For example, pink_elephant is a valid tweet word, while bits&pieces is not. (In fact, bits&pieces has two valid tweet words, bits and pieces, with an ampersand (&) between them.)

hashtag: A word in a tweet that begins with the hash symbol. Twitter uses the number sign (#) as the hash symbol. For our assignment, we’ll use the constant HASHTAG_SYMBOL to represent the hash symbol. Hashtags are used to label important words or terms in a tweet. A valid hashtag has the hash symbol as its first character and the rest of the characters form a valid tweet word. In other words, a hashtag begins with the hash symbol, and contains all alphanumeric characters and underscores up to (but not including) the first non-alphanumeric character (such as space, punctuation, etc.) or the end of the tweet. A hashtag must contain at least one alphanumeric character. #UofT, #cscA08, and #Go_Raptors are three examples of hashtags on Twitter. Note that a hashtag is not a valid tweet word, because it has the hash symbol as its first character.

mention: A word in a tweet that begins with the mention symbol. Twitter uses the at-sign (@) as the mention symbol. For our assignment, we’ll use the constant MENTION_SYMBOL to represent the mention symbol. Mentions are used to direct a message at or about a particular Twitter user, so the word should be a Twitter username (for the purposes of this assignment, we will not check whether the word that follows the MENTION_SYMBOL is a real username; we’ll just assume it is). For our purposes, the definition of a mention is very similar to that of a hashtag. A valid mention has the mention symbol as its first character and the rest of the characters form a valid tweet word. In other words, a mention begins with the at-sign, and contains all alphanumeric characters and underscores up to (but not including) the first non-alphanumeric character (such as space, punctuation, etc.) or the end of the tweet. A mention must contain at least one alphanumeric character.
@redcrosscanada, @UN_Women, and @UofTGrad2019 are three examples of Twitter mentions. Note that a mention is not a valid tweet word, because it has the mention symbol as its first character.

Here are some more interesting examples of how we will treat valid tweet words, hashtags, and mentions in this assignment.

In the tweet

Raptors win championship,#NBAFINALS, Go @Raptors!!!  #WeTheNorth

we have four valid tweet words (Raptors, win, championship, and Go), two hashtags (#NBAFINALS and #WeTheNorth), and one mention (@Raptors). It is important to note that in this example there is no space between the first comma and the hashtag #NBAFINALS, there is a comma immediately following the hashtag #NBAFINALS, there are three exclamation marks immediately following the mention @Raptors, and there is more than one space after the exclamation marks. All of these are valid in a tweet. Also note that the first occurrence of the word Raptors is not considered to be a mention, because it does not have the mention symbol.

In the tweet

@UofT welcomes its 2019 graduates! #UofTGrad2019#graduation!

we have four valid tweet words (welcomes, its, 2019, and graduates), two hashtags (#UofTGrad2019 and #graduation), and one mention (@UofT). It is important to note that in this example there is no space between the hashtags #UofTGrad2019 and #graduation. This is also valid in a tweet.

Some more obscure yet valid examples:

- In something#something_else, we consider something to be a valid tweet word and #something_else to be a hashtag.
- In no@spaces#whatsoever?!, we consider no to be a tweet word, @spaces to be a mention, and #whatsoever to be a hashtag.

For a complete list of Twitter terms, check out the Twitter glossary.

Starter code

For this assignment, we are giving you some files, including a Python starter code file. Please download the Assignment 1 Files and extract the zip archive.
Starter code: tweet.py
This file contains some constants, and the header and the complete docstring (but not the body) for the first function you are to write. Your job is to complete this file.

Checker: a1_checker.py
We have provided a checker program that you should use to check your code. See below for more information about a1_checker.py.

Constants

Constants are special variables whose values do not change once assigned. A different naming convention (uppercase pothole) is used for constants, so that programmers know not to change their values. For example, in the starter code, the constant MAX_TWEET_LENGTH is assigned the value 50 at the beginning of the module, and the value of MAX_TWEET_LENGTH should never change in your code. When writing your code, if you need to use the value of the maximum tweet length, you should use MAX_TWEET_LENGTH. The same goes for the other constant values.

Using constants simplifies code modifications and improves readability. If we later decide to use a different tweet length, we would only have to change the length in one place (the MAX_TWEET_LENGTH assignment statement), rather than throughout the program.

What to do

In the starter code file tweet.py, complete the following function definitions. Use the Function Design Recipe that you have been learning in this course. We have included the type contracts in the following table; please read through the table to understand how the functions will be used.

We will be evaluating your docstrings in addition to your code. Please include two examples in your docstrings. You will need to paraphrase the full descriptions of the functions to get an appropriate docstring description.

Function name: (Parameter types) -> Return type
Full Description (paraphrase to get a proper docstring description)

is_valid_tweet: (str) -> bool
The parameter represents a potential tweet. The function should return True if and only if the tweet contains between 1 and MAX_TWEET_LENGTH characters, inclusive.

compare_tweet_lengths: (str, str) -> int
The two parameters represent valid tweets. This function must return one of three integers: 1 (if the first tweet is longer than the second), -1 (if the second tweet is longer than the first), or 0 (if the tweets have the same length).

add_hashtag: (str, str) -> str
The first parameter represents a valid tweet. The second parameter represents a valid tweet word. Appending a space, a hash symbol, and the tweet word to the end of the original tweet will result in a potential tweet. If the potential tweet is a valid tweet, the function should return the potential tweet. If the potential tweet is not a valid tweet, the function should return the original tweet.
For example (assuming the hash symbol is '#'), if the first argument is 'I like' and the second argument is 'cscA08', then the function should return 'I like #cscA08' if MAX_TWEET_LENGTH is at least 14. Otherwise, it should return 'I like'.

contains_hashtag: (str, str) -> bool
The first parameter represents a valid tweet, and the second parameter represents a valid tweet word. This function should return True if and only if the tweet contains a hashtag made up of the hash symbol and the tweet word.
For example (assuming the hash symbol is '#'), if the first argument is 'I like #cscA08' and the second argument is 'cscA08', then the function should return True.
Notes: If the first argument is 'I like #cscA08' and the second argument is 'csc', then the function should return False. Also, if the first argument is 'I like #cscA08, #mat137, and #phl101' and the second argument is 'cscA08', the function should return True.
Hint: Use the helper function clean that is provided in the starter code.

is_mentioned: (str, str) -> bool
The first parameter represents a valid tweet, and the second parameter represents a valid tweet word. This function should return True if and only if the tweet contains a mention made up of the mention symbol and the tweet word.
For example (assuming the mention symbol is '@'), if the first argument is 'Go @Raptors!' and the second argument is 'Raptors', then the function should return True.
Hint: This function is very similar to the function contains_hashtag. What can you do to avoid writing the same code twice?

add_mention_exclusive: (str, str) -> str
The first parameter represents a valid tweet and the second parameter represents a valid tweet word. Appending a space, a mention symbol, and the tweet word to the end of the original tweet will result in a potential tweet. If the potential tweet is valid and the original tweet contains the given tweet word, the function should return the potential tweet. In all other cases, the function should return the original tweet. Note that if the tweet word is mentioned in the original tweet (i.e., it appears with a MENTION_SYMBOL as a first character), then the function should return the original tweet.
For example (assuming the mention symbol is '@'), if the first argument is 'Go Raptors!' and the second argument is 'Raptors', then the function should return 'Go Raptors! @Raptors'. If, on the other hand, the first argument is 'Go @Raptors!' and the second argument is 'Raptors', then the function should return the original tweet 'Go @Raptors!'.
Hint: Can you use one of your other functions as a helper function?

Functions to write for A1

num_tweets_required: (str) -> int
The parameter represents a message. This function should return the minimum number of tweets that would be required to communicate the entire message. Recall that the maximum length of a tweet is MAX_TWEET_LENGTH.
Hint: The ceil function in the math module is useful here.

get_nth_tweet: (str, int) -> str
The first parameter represents a message that a Twitter user would like to post, and the second parameter, n, represents an integer greater than or equal to 0. If the message contains too many characters, it would need to be split up into a sequence of tweets.
All of the tweets in the sequence, except possibly the last tweet, would be of length MAX_TWEET_LENGTH. This function should return the nth valid tweet in the sequence of tweets. Note that the first tweet in the sequence has index 0, the second tweet in the sequence has index 1, and so on. If the value of the second parameter is too large, so that there is no index-n tweet in the sequence, this function should return an empty string.

Using Constants

As we discuss in the section Constants above, your code should make use of the provided constants. If the value of one of those constants were changed and your program rerun, your functions should work with those new values. For example, if MAX_TWEET_LENGTH were changed, then your functions should work according to the new maximum tweet length.

Your docstring examples should reflect the given values of the constants in the provided starter code, and do not need to change.

No Input or Output

Your tweet.py file should contain the starter code, plus the function definitions specified above. tweet.py must not include any calls to the print and input functions. Do not add any import statements. Also, do not include any function calls or other code outside of the function definitions.

How should you test whether your code works?

First, run the checker and review ALL output (you may need to scroll). You should also test each function individually by writing code to verify your functions in the Python shell. For example, after defining the function compare_tweet_lengths, you might call it from the shell (e.g., compare_tweet_lengths('I love', 'programming')) to check whether it returns the right value (-1). One call usually isn’t enough to thoroughly test the function. For example, we should also test compare_tweet_lengths('programming', 'is fun'), which should return 1, and compare_tweet_lengths('this course', 'is for me!!'), which should return 0.
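The three shell calls described above can be collected into a short script, sketched below. The function body here is only a hypothetical stand-in so that the calls have something to run against; it is not the starter code or an official solution, and your own version belongs in tweet.py. The variable names are also made up for illustration.

```python
def compare_tweet_lengths(tweet1: str, tweet2: str) -> int:
    """Return 1 if tweet1 is longer than tweet2, -1 if tweet2 is longer
    than tweet1, and 0 if the two tweets have the same length.

    Hypothetical stand-in used only to make the test calls runnable.
    """
    if len(tweet1) > len(tweet2):
        return 1
    if len(tweet1) < len(tweet2):
        return -1
    return 0


# One call for each of the three possible return values.
first_is_shorter = compare_tweet_lengths('I love', 'programming')     # -1
first_is_longer = compare_tweet_lengths('programming', 'is fun')      # 1
same_length = compare_tweet_lengths('this course', 'is for me!!')     # 0
```

Choosing one call per possible return value (1, -1, and 0) covers every branch of the function, which is why a single call is not enough.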

A1 Checker

We are providing a checker module, a1_checker.py, that tests two things:

- whether your code follows the Python style guidelines, and
- whether your functions are named correctly, have the correct number of parameters, and return the correct types.

To run the checker, open a1_checker.py and run it. Note: the checker file should be in the same directory as your tweet.py, as provided in the starter code zip file. We have posted a demo of the checker being run and included it in the Week 3 Prepare exercises on PCRS. Be sure to scroll up to the top and read all messages.

If the checker passes for both style and types:

- Your code follows the style guidelines.
- Your function names, number of parameters, and return types match the assignment specification. This does not mean that your code works correctly in all situations. We will run a different set of tests on your code once you hand it in, so be sure to thoroughly test your code yourself before submitting.

If the checker fails, carefully read the message provided:

- It may have failed because your code did not follow the style guidelines. Review the error description(s) and fix the code style. Please see the PyTA documentation for more information about errors.
- It may have failed because you are missing one or more functions, one or more of your functions is misnamed, one or more of your functions has the incorrect number or type of parameters, or one or more of your function return types does not match the assignment specification. Read the error message to identify the problematic function, review the function specification in the handout, and fix your code.

Make sure the checker passes before submitting.

Marking

These are the aspects of your work that may be marked for A1:

Coding style (20%): Make sure that you follow the Python style guidelines that we have introduced and the Python coding conventions that we have been using throughout the semester.
Although we don’t provide an exhaustive list of style rules, the checker tests for style are complete, so if your code passes the checker, then it will earn full marks for coding style, with one exception: docstrings may be evaluated separately. For each occurrence of a PyTA error, a deduction of one mark (out of 20) will be applied. For example, if a C0301 (line-too-long) error occurs 3 times, then 3 marks will be deducted. All functions, including helper functions, should have complete docstrings, including preconditions when you think they are necessary.

Correctness (80%): Your functions should perform as specified. Correctness, as measured by our tests, will count for the largest single portion of your marks. Once your assignment is submitted, we will run additional tests not provided in the checker. Passing the checker does not mean that your code will earn full marks for correctness.

No Remark Requests

No remark requests will be accepted. A syntax error could result in a grade of 0 on the assignment. Before the deadline, you are responsible for running your code and the checker program to identify and resolve any errors that will prevent our tests from running.

What to Hand In

The very last thing you do before submitting should be to run the checker program one last time. Otherwise, you could make a small error in your final changes before submitting that causes your code to receive zero for correctness.

Submit tweet.py on MarkUs by following the instructions on the course website. Remember that spelling of filenames, including case, counts: your file must be named exactly as above.