Python Jupyter notebook Assignment

Order Description

INFO 5717: Networked Data Modeling and Processing
Assignment Three

This assignment has four questions. Submit your assignment by following the steps listed below:

(1) Create a new Word document and name it "_Assign3.doc";
(2) Record screenshots of the output of your code in the Word document;
(3) Create a new Jupyter Notebook and name it "_assognment-three-code.ipynb";
(4) Write your code for questions one to four in the "_assognment-three-code.ipynb" file, using a text chunk or comment to separate each question and sub-question (see the following screenshot as an example);
(5) Upload and commit your "_assognment-three-code.ipynb" file to your GitHub repo info5717.

Question 1. (30 points). Given the variable Text below, write a Python program to answer the following sub-questions. You may use regular expressions or the text processing functions from Lesson 4. DO NOT use the features of the NLTK package or any other linguistics package; write your own code instead.

Text = "Harry lay in his dark cupboard much later, wishing he had a watch. He didn't know what time it was and he couldn't be sure the Dursleys were asleep yet. Until they were, he couldn't risk sneaking to the kitchen for some food. He'd lived with the Dursleys almost ten years, ten miserable years, as long as he could remember, ever since he'd been a baby and his parents had died in that car crash. He couldn't remember being in the car when his parents had died. Sometimes, when he strained his memory during long hours in his cupboard, he came up with a strange vision: a blinding flash of green light and a burning pain on his forehead. This, he supposed, was the crash, though he couldn't imagine where all the green light came from. He couldn't remember his parents at all. His aunt and uncle never spoke about them, and of course he was forbidden to ask questions. There were no photographs of them in the house. "

(1) (10 points) Calculate the number of words that start with a vowel (a, e, i, o, or u).
Print out a list of these words, along with their frequency (the number of times they appear in the text). The list should be in decreasing order of frequency.

(2) (10 points) Print out the words that have an apostrophe (') in the middle of the word. Examples include English contractions such as "didn't", "wasn't", "couldn't", or "she'd".

(3) (10 points) Remove all stopwords and all words with an apostrophe from the text, and print out the result. You can find a list of English stopwords here: http://www.ranks.nl/stopwords.

Question 2. (40 points). I am interested in knowing how the climate changes in terms of temperature and precipitation. The U.S. Climate Data site contains climate data for Denton, Texas since 2009. I would like you to do some calculations and comparisons between the data from 2010 and 2017 in order to answer the following two research questions:

RQ1: Is 2017 significantly different from 2010 in temperature and precipitation in the months January-June?
RQ2: Is June 2017 significantly hotter than June 2010?

(1) (6 points). Create two files based on the data published at U.S. Climate Data (http://www.usclimatedata.com/climate/denton/texas/united-states/ustx0353):

➢ File A (called 2010-Jan-June.txt or 2010-Jan-June.csv) contains daily weather data from January 1, 2010 to June 30, 2010;
➢ File B (called 2017-Jan-June.txt or 2017-Jan-June.csv) contains daily weather data from January 1, 2017 to June 30, 2017.

To find the data, go to the "History" tab of the above page and select the right year and month. The data will then be presented to you. The final format of each result file should look like the following:

, , ,
1-Jan,55,33,0.08
2-Jan,55,33,0.12
……
1-June,80,56,0.15
…

The delimiter can be a comma (,) or whitespace. Make sure you round the temperature values so that there are no decimal points.

(2) (12 points).
Write a program to calculate the mean, median, and standard deviation of the high temperature, low temperature, and precipitation in each file, and output the results in the following format:

File name            mean     median   standard deviation
2010-Jan-June.txt    -----    -----    -----
2017-Jan-June.txt

(3) (16 points). In order to answer the first research question, we would like to conduct some statistical tests. Take File A and File B and conduct a t-test on TWO RELATED samples (you can use scipy.stats.ttest_rel: http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.ttest_rel.html) on Temperature High, Temperature Low, and Precipitation to find out whether there is a significant difference between these scores. Report your results in statements after the program using a docstring.

(4) (6 points). Describe how you can answer research question 2 and what your answer is.

Question 3. (30 points). We have discussed how to create a Python class and how to define the properties, functions, and data types it requires. For this question, you are required to create and implement the class Student in the "_assognment-three-code.ipynb" file. Your class should be able to do the following:

(1) (6 pts) Every instance of the Student class should have the following attributes:

➢ First name
➢ Last name
➢ Middle name
➢ euid; by default, the euid is blank
➢ GPA; by default, the GPA is set to 0
➢ Classes taken. This object should store the class number (such as INFO 5717), the semester it was taken in (such as "Fall 2019"), and the grade the student obtained. HINT: use a collection data type.

(2) (6 pts) Write a program in the "_assognment-three-code.ipynb" file that asks the user to enter information for three fictional students. The program should prompt the user for each student's first, middle, and last names. This information should be stored in an object of type Student.
(3) (6 pts) Write a function for the Student class that assigns each student a euid. The euid should be composed of three letters (the first letters of the student's first, middle, and last names) followed by four digits chosen randomly. For example, if a student's name is Jennifer Lynn Meyers, her euid could be "jlm8940". Use this function to print out the names and euids of the students you created in (2).

(4) (6 pts) Write a function for the Student class called "register". This function asks the user to enter a class number, a semester, and a grade for each student. This information is then added to the Student object. Call this function for the students you created in part (2). Each student should be registered for at least two courses but can have more.

(5) (6 pts) Write a function for the Student class that calculates a student's GPA and prints it out. Use it to print out the names and GPAs of the students you created in (2).

Extra credit

Question 4. (20 points). Create a quiz by using Python 3 to implement a set of functions. You are required to write your code inside the functions defined below:

(1) (3 points).

    def quiz_introduction():
        # In this function, display the quiz in the format shown in the following picture.

(2) (3 points).

    def quiz_questions():
        # In this function, display the questions that a user can choose to answer; the user can also choose to exit. The display is shown in the following picture.

    def section_separator():
        # Print "------------------------" to separate each section.
        print("-" * 24)

(3) (3 points).

    def get_user_question_choice():
        # In this function, get the user's input for choosing which question to answer, and return the question index so that the code below knows which question should be displayed.
    def get_user_input_solution(problem):
        # In this function, get the user's answer for the solution-checking process.
        print("Enter your answer")
        print(problem, end="")
        result = int(input(" = "))
        return result

(4) (3 points).

    def solution_checking(user_solution, solution, count):
        # This function should be used for checking whether the user's solution is correct.
        # user_solution is the solution entered by the user, solution is the correct solution, and count is the number of correct answers the user has so far.

(5) (5 points).

    def quiz_questions(index, count):
        # Define four different math questions here: addition, subtraction, multiplication, and integer division.
        # index is the question index, and count is the number of correct answers the user has so far.

(6) (3 points).

    def show_result(total, correct):
        # In this function, show the result as displayed in the following picture.
        # total is the total number of questions the user answered, and correct is the number of correct answers the user got.

    def main():
        quiz_introduction()
        quiz_questions()
        section_separator()
        option = get_user_question_choice()
        total = 0
        correct = 0
        while option != 5:
            total = total + 1
            correct = quiz_questions(option, correct)
            option = get_user_question_choice()
        print("Exit the quiz.")
        section_separator()
        show_result(total, correct)

    main()

The following picture shows a sample output:
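
As an illustration for part (4) of Question 4, the sketch below shows one way solution_checking could behave. It assumes the function returns the updated count of correct answers, which the assignment does not state explicitly, and the feedback messages are hypothetical placeholders rather than the required wording.

```python
def solution_checking(user_solution, solution, count):
    """Compare the user's answer with the correct one and return the updated count.

    Assumption: the function returns the new number of correct answers.
    The printed messages are placeholders, not the required wording.
    """
    if user_solution == solution:
        print("Correct.")
        return count + 1
    print(f"Incorrect. The correct answer is {solution}.")
    return count


# Example: one right answer followed by one wrong answer.
count = 0
count = solution_checking(7, 3 + 4, count)   # right answer, count goes up by one
count = solution_checking(10, 3 * 4, count)  # wrong answer, count is unchanged
print(count)
```

Returning the count, rather than modifying a global variable, keeps the function easy to test on its own and lets the caller decide where the running total is stored.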