Final Project

Due: by the end of the calendar day on Wednesday, May 3, 2023

Assignment Requirements

For the final project, I would like you to extend the web-scraping assignment that you previously completed. You must complete the following items:

  1. a program that does something interesting using data scraped from one or more web sites
  2. at least one custom class that you create for storing your data
  3. a document which covers the following:
    • a description of what your program does
    • a cut-and-paste transcript or screenshots showing a sample run of your program
    • a discussion of the data structure(s) you used to solve your problem - this should include
      • a list of the methods your program uses
      • the Big O time complexity of each of those methods
      • a defense of why this was the best data structure to use in this case
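
As a rough illustration of items 2 and 3, here is a minimal sketch of a custom class for scraped data, with the Big O time complexity of each method noted in a comment. The class names, fields, and sample values are hypothetical and chosen only for this example; your own class should match whatever data your program scrapes.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class PlayerRecord:
    """One scraped row (hypothetical fields, chosen for illustration)."""
    name: str
    games: int
    points: int

    def points_per_game(self) -> float:
        # O(1): simple arithmetic on stored fields
        return self.points / self.games if self.games else 0.0


@dataclass
class PlayerTable:
    """Stores records in a dict keyed by player name."""
    records: dict = field(default_factory=dict)

    def add(self, record: PlayerRecord) -> None:
        # O(1) on average: dict insertion
        self.records[record.name] = record

    def find(self, name: str):
        # O(1) on average: dict lookup; returns None if the name is absent
        return self.records.get(name)

    def average_points(self) -> float:
        # O(n): visits every record once (raises StatisticsError if empty)
        return mean(r.points for r in self.records.values())


# Made-up sample data standing in for scraped values
table = PlayerTable()
table.add(PlayerRecord("Player A", games=10, points=250))
table.add(PlayerRecord("Player B", games=8, points=120))
print(table.find("Player A").points_per_game())
print(table.average_points())
```

In this sketch a dict is used because lookups by name are the most frequent operation; if your program mostly needs sorted output or range queries, a different structure may be easier to defend.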

Some Ideas

Here are some examples of projects you could choose:

  1. a program that finds all Wikipedia articles that are within $n$ clicks of some starting article
  2. a program that can tell how many clicks it takes to get from one Wikipedia article to another (you may limit it to the first 10 or 20 links so that it can finish in a reasonable time)
  3. a program that collects statistics from a sports reference page (e.g., FanGraphs, Ultimate Tennis Statistics) and presents some kind of useful information (like averages, max/min, etc.)
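
One possible way to approach the first two ideas is a breadth-first search over links. The sketch below is only an outline under stated assumptions: `FAKE_LINKS` and `get_links` are hypothetical stand-ins for your own scraping code, and the function names and the default limits are made up for this example.

```python
from collections import deque

# Hypothetical link graph standing in for scraped pages: article -> linked articles
FAKE_LINKS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": [],
    "F": [],
}


def get_links(article, limit=20):
    """Placeholder for a scraping function; returns at most `limit` links."""
    return FAKE_LINKS.get(article, [])[:limit]


def distances_from(start, max_clicks):
    """Breadth-first search: map each article within max_clicks to its click count."""
    distance = {start: 0}          # dict gives O(1) average-case "seen" checks
    queue = deque([start])         # deque gives O(1) pops from the left
    while queue:
        current = queue.popleft()
        if distance[current] == max_clicks:
            continue               # do not expand past the click limit
        for link in get_links(current):
            if link not in distance:
                distance[link] = distance[current] + 1
                queue.append(link)
    return distance


def clicks_between(start, target, max_clicks=6):
    """Return the fewest clicks from start to target, or None if not found in range."""
    return distances_from(start, max_clicks).get(target)


print(distances_from("A", 2))
print(clicks_between("A", "F"))
```

Because each article is added to the queue at most once and each link is examined once, the search takes time proportional to the number of articles visited plus the number of links followed. That is the kind of analysis the project document asks for.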

Groups

You can work individually or in groups of 2-3 students. Only one student needs to turn in the project from each group.

What to turn in

Turn in the following to the Final Project Submission Form on Blackboard:

  • any .py files with your project code
  • a document with all of the items listed above (in doc, docx, or pdf form - or a link to an online document like Google Doc)

Grading

This project is worth 10% of your course grade. It will be graded on the following criteria:

  • 2 points - the students have used web scraping to get data from a web page
  • 2 points - the students have created a custom class for the data with good programming practices
  • 2 points - the document includes a clear description of the program and sample run
  • 2 points - the students have performed correct Big O analysis on each of the data-centric tasks that their program does
  • 2 points - the students have made a correct and convincing argument that their selected data structure is the best for their program