Web APIs and Plotly¶

CS 66: Introduction to Computer Science II¶

References for this lecture¶

Requests module user's guide: https://docs.python-requests.org/en/latest/

Plotly graphing library: https://plotly.com/python/

Exploring data from the COVID API¶

Last time, we got the following code working.

  • If you don't yet have the requests module installed, see the notes from last time for instructions (and ask for help if you run into any problems!)
  • Run this code again and make sure you're able to print some data
In [ ]:
import requests

response = requests.get("https://api.covid19api.com/live/country/united-states")

data = response.json()
print(data)

Group Activity Problem 1¶

Explore the data variable you got back.

Answer the following questions about it:

  • What is the format of this data?
  • How many items did you get back?
  • What do you think this data represents?
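
One way to approach questions like these is to ask Python about a value directly using type(), len(), and .keys(). The sketch below uses a small made-up list about cities (the names and numbers are invented for illustration and have nothing to do with the COVID API), so it runs without a network connection. The same calls can be tried on the data variable.

```python
# Hypothetical sample data, invented for this example only.
sample = [
    {"city": "Des Moines", "temp": 71},
    {"city": "Ames", "temp": 68},
    {"city": "Iowa City", "temp": 73},
]

outer_type = type(sample).__name__          # what kind of container is this?
item_count = len(sample)                    # how many items does it hold?
first_item_type = type(sample[0]).__name__  # what kind of thing is one item?
first_item_keys = list(sample[0].keys())    # which fields does one item have?

print(outer_type, item_count)
print(first_item_type, first_item_keys)
```

If the outer value turns out to be a dictionary instead of a list, calling .keys() on it directly is a reasonable first step.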

Group Activity Problem 2¶

Write the line of code that would print the 50th item in the list.

Group Activity Problem 3¶

Write the line of code that prints the state/province/region represented by the 50th item.

Group Activity Problem 4¶

Write the line of code that prints the state/province/region represented by the 50th item along with the date and the number of active COVID cases.

Group Activity Problem 5¶

Write a loop that will print the state/province/region, date, and active cases for all items.

Group Activity Problem 6¶

Add an if statement to your loop that will make it print the information only if it is for Iowa.

Group Activity Problem 7¶

Now we're going to try accessing some different data from the same Web API service. Notice that the code below is the same as before, except that it uses a different web address. These different web addresses are called endpoints of the API.

In [ ]:
import requests

response = requests.get("https://api.covid19api.com/summary")

data = response.json()
print(data)

Discuss the format of this data. It's not the same as with the other endpoint: this time it's not just a list of dictionaries like we've seen before. What is the type of the outermost thing (data)? How many countries are represented? Write the answers in your notes.

Note that you can find more endpoints here: https://documenter.getpostman.com/view/10808728/SzS8rjbc
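
Since both endpoints used so far share the same beginning, one way to think about an endpoint is as a base address plus a path. The sketch below illustrates that idea. The endpoint_url helper and the BASE_URL name are made up for this example (they are not part of requests or of the API), and the code only builds the address strings without contacting the server.

```python
# The part of the address shared by every endpoint used in these notes.
BASE_URL = "https://api.covid19api.com"


def endpoint_url(path):
    """Hypothetical helper: join the base address and an endpoint path."""
    # lstrip("/") avoids a double slash if the path already starts with one
    return BASE_URL + "/" + path.lstrip("/")


summary_url = endpoint_url("summary")
live_us_url = endpoint_url("live/country/united-states")

print(summary_url)
print(live_us_url)
```

Either string could then be passed to requests.get in place of the full address typed out by hand.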

Group Activity Problem 8¶

Write the code that will use the https://api.covid19api.com/summary endpoint to display the number of new deaths from COVID in the United States of America.

Plotly: A Visualization Library¶

Now we're going to use another new Python package: plotly. Plotly is a really neat package for making interactive visualizations from data.

We need to install two new packages to make this work: pandas and plotly.

For these commands, replace python3 with the path to the Python executable on your computer (ask for help if you forgot how to do this!)

python3 -m pip install pandas
python3 -m pip install plotly
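
After installing, it can help to confirm that Python can actually find the packages before running the plotting code. The sketch below is one possible check using the standard library's importlib.util.find_spec. The is_installed helper is a name made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Hypothetical helper: True if Python can find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Report on each package this lecture relies on.
for name in ["requests", "pandas", "plotly"]:
    if is_installed(name):
        print(name, "is installed")
    else:
        print(name, "is missing; try the pip command for it again")
```

If a package is reported as missing even after installing it, a likely cause is that pip ran under a different Python executable than the one running your code.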
In [ ]:
import plotly.express as px
import requests


response = requests.get("https://api.covid19api.com/summary")

data = response.json()
country_data = data["Countries"]

fig = px.bar(country_data, x="Country", y="NewConfirmed", title="New Confirmed Cases by Country")
fig.show()

Processing the data to make a better visualization¶

That chart had too many data points, so we can't really see what's going on!

Let's loop through the list of countries and keep only the entries that have at least 10000 new confirmed cases.

In [ ]:
import plotly.express as px
import requests


response = requests.get("https://api.covid19api.com/summary")

data = response.json()

display_data = []  # this is where we'll put the records we want to display

# keep only the countries with at least 10000 new confirmed cases
for current_country in data["Countries"]:
    if current_country["NewConfirmed"] >= 10000:
        display_data.append(current_country)

fig = px.bar(display_data, x="Country", y="NewConfirmed", title="New Confirmed Cases by Country")
fig.show()
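
A further step that may make the chart easier to read is to sort the filtered records from largest to smallest before plotting. The sketch below shows the sorting step on its own, using made-up records in the same shape as the filtered list (the country names and numbers are invented, and the new_confirmed helper is a name chosen for this example).

```python
# Made-up records in the same shape as the filtered list; numbers are invented.
display_data = [
    {"Country": "Country A", "NewConfirmed": 12000},
    {"Country": "Country B", "NewConfirmed": 48000},
    {"Country": "Country C", "NewConfirmed": 23000},
]


def new_confirmed(record):
    """Return the value we want to sort by."""
    return record["NewConfirmed"]


# sorted() returns a new list; reverse=True puts the largest value first.
sorted_data = sorted(display_data, key=new_confirmed, reverse=True)

for record in sorted_data:
    print(record["Country"], record["NewConfirmed"])
```

Passing sorted_data to px.bar in place of display_data should then draw the bars in that order.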

Group Activity Problem 9¶

Try this code to make a plot from the https://api.covid19api.com/live/country/united-states endpoint.

Discuss the following questions:

  • What is the difference between px.bar and px.line?
  • Why do we plot data here instead of data["Countries"] like in the previous example?
In [ ]:
import plotly.express as px
import requests

response = requests.get("https://api.covid19api.com/live/country/united-states")

data = response.json()
data = data[2:]  # the first two data points are erroneous, so we drop them

fig = px.line(data, x="Date", y="Deaths", color="Province", title="COVID Deaths in the US")
fig.show()

Group Activity Problem 10¶

Try writing a loop like the one we used with the country-level data to filter out everything except Iowa.

Can you change it so that it keeps just two or three states of your choice?