Web APIs and Plotly
References for this lecture
Requests module user’s guide: https://docs.python-requests.org/en/latest/
Plotly graphing library: https://plotly.com/python/
Exploring data from the COVID API
Last time, we got the following code working.
- If you don’t yet have the requests module installed, see the notes from last time on how to do it (and ask for help if you run into any problems!)
- Run this again and make sure you’re able to print some data
import requests
response = requests.get("https://api.covid19api.com/live/country/united-states")
data = response.json()
print(data)
Group Activity Problem 1
Explore the data variable you got back.
Answer the following questions about it:
- What is the format of this data?
- How many items did you get back?
- What do you think this data represents?
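One way to approach questions like these is to check the type and length of the value, then look closely at a single item. The sketch below uses a small made-up list named sample_data in place of the real response, so its contents are illustrative assumptions, not actual API output.

```python
# Made-up stand-in for the API response (values are illustrative only)
sample_data = [
    {"Province": "Iowa", "Date": "2021-01-01", "Deaths": 10},
    {"Province": "Ohio", "Date": "2021-01-01", "Deaths": 20},
]

print(type(sample_data))            # the type of the outer container
print(len(sample_data))             # how many items it holds
print(type(sample_data[0]))         # the type of a single item
print(list(sample_data[0].keys()))  # the field names in one item
```

Running the same four calls on the real data variable should give you what you need to answer the questions.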
Group Activity Problem 2
Write the line of code that would print the 50th item in the list.
Group Activity Problem 3
Write the line of code that prints the state/province/region represented by the 50th item.
Group Activity Problem 4
Write the line of code that prints the state/province/region represented by the 50th item along with the date and the number of active COVID cases.
Group Activity Problem 5
Write a loop that will print the state/province/region, date, and active cases for all items.
Group Activity Problem 6
Add an if statement to your loop that will make it print the information only if it is for Iowa.
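Here is one possible shape for a loop with an if statement inside it, shown on made-up records instead of the live response. The field names "Province" and "Date" match the ones used in the plotting code later in this lecture; "Active" is an assumption about the name of the active-cases field, so check it against the data you actually get back.

```python
# Made-up records standing in for the API response (not real data)
sample_data = [
    {"Province": "Iowa", "Date": "2021-01-01", "Active": 100},
    {"Province": "Ohio", "Date": "2021-01-01", "Active": 250},
    {"Province": "Iowa", "Date": "2021-01-02", "Active": 120},
]

iowa_lines = []  # collected here so the result is easy to check
for item in sample_data:
    # Only handle records whose state/province/region is Iowa
    if item["Province"] == "Iowa":
        line = f'{item["Province"]} {item["Date"]} {item["Active"]}'
        print(line)
        iowa_lines.append(line)
```

Removing the if line (and un-indenting the lines under it) prints every record instead of just the Iowa ones.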
Group Activity Problem 7
Now we’re going to try accessing some different data from the same Web API service. Notice that the code below is the same, but it uses a different web address - these different web addresses are called endpoints of the API.
import requests
response = requests.get("https://api.covid19api.com/summary")
data = response.json()
print(data)
Discuss the format of this data - it’s not the same as with the other endpoint. This is an example where it’s not just a list of dictionaries like we’ve seen before. What is the type of the outermost thing (data)? How many countries are represented? Write the answers in your notes.
Note that you can find more endpoints here: https://documenter.getpostman.com/view/10808728/SzS8rjbc
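When a response is not a plain list, it can help to inspect it one level at a time. The sketch below does this on a made-up value named sample_summary. The "Countries" key and the "Country" and "NewConfirmed" fields appear in the plotting code later in this lecture; the "Date" key and all of the values are assumptions for illustration only.

```python
# Made-up stand-in for a nested response (not real data)
sample_summary = {
    "Date": "2021-01-01",  # hypothetical extra top-level key
    "Countries": [
        {"Country": "Canada", "NewConfirmed": 5},
        {"Country": "Mexico", "NewConfirmed": 7},
    ],
}

print(type(sample_summary))             # type of the outer container
print(list(sample_summary.keys()))      # names of the top-level keys

countries = sample_summary["Countries"]
print(type(countries), len(countries))  # type and length of the inner value
print(countries[0])                     # one record from the inner list
```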
Group Activity Problem 8
Write the code that will use the https://api.covid19api.com/summary endpoint to display the number of new deaths from COVID in the United States of America.
Plotly: A Visualization Library
Now we’re going to use another new Python package: plotly. Plotly is a really neat package for making interactive visualizations from data.
We need to install two new packages to make this work: pandas and plotly.
For these commands, replace python3 with the path to the Python executable on your computer (ask for help if you forgot how to do this!):
python3 -m pip install pandas
python3 -m pip install plotly
import plotly.express as px
import requests
response = requests.get("https://api.covid19api.com/summary")
data = response.json()
country_data = data["Countries"]
fig = px.bar(country_data,x="Country",y="NewConfirmed",title="New Confirmed Cases by Country")
fig.show()
Processing the data to make a better visualization
That chart had too many data points - we can’t really see what’s going on!
Let’s loop through the list of countries and keep only the entries that have at least 10000 new confirmed cases:
import plotly.express as px
import requests
response = requests.get("https://api.covid19api.com/summary")
data = response.json()
display_data = []  # this is where we'll put the records we want to display
for current_country in data["Countries"]:
    if current_country["NewConfirmed"] >= 10000:
        display_data.append(current_country)
fig = px.bar(display_data,x="Country",y="NewConfirmed",title="New Confirmed Cases by Country")
fig.show()
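The same filtering step can also be written as a list comprehension, and sorting the kept records may make the bar chart easier to read. This is an optional sketch on made-up records (the country names and numbers are not real data), and THRESHOLD is a hypothetical name for the 10000 cutoff.

```python
# Made-up records standing in for data["Countries"] (not real data)
sample_countries = [
    {"Country": "A", "NewConfirmed": 15000},
    {"Country": "B", "NewConfirmed": 300},
    {"Country": "C", "NewConfirmed": 42000},
    {"Country": "D", "NewConfirmed": 10000},
]

THRESHOLD = 10000  # hypothetical name for the cutoff

# Keep only the records at or above the threshold
display_data = [c for c in sample_countries if c["NewConfirmed"] >= THRESHOLD]

# Sort from the largest count to the smallest
display_data.sort(key=lambda c: c["NewConfirmed"], reverse=True)

print([c["Country"] for c in display_data])
```

The resulting display_data list has the same structure as before, so it could be passed to px.bar in the same way.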
Group Activity Problem 9
Try this code to make a plot from the https://api.covid19api.com/live/country/united-states endpoint.
Discuss the following questions
- What is the difference between px.bar and px.line?
- Why do we plot data here instead of data["Countries"] like in the previous example?
import plotly.express as px
import requests
response = requests.get("https://api.covid19api.com/live/country/united-states")
data = response.json()
data = data[2:] #the first two data points are erroneous, so you can remove them
fig = px.line(data, x="Date", y="Deaths", color="Province", title='COVID Deaths in the US')
fig.show()
Group Activity Problem 10
Try writing a loop like we did with the country-level data to filter out all but Iowa.
Can you make it so that it works with just two or three states?
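One way to handle several states at once is to keep the names you want in a set and test membership with the in operator. This is a minimal sketch on made-up records; the choice of states is arbitrary and the numbers are not real data.

```python
# Made-up records standing in for the API response (not real data)
sample_data = [
    {"Province": "Iowa", "Date": "2021-01-01", "Deaths": 10},
    {"Province": "Ohio", "Date": "2021-01-01", "Deaths": 20},
    {"Province": "Texas", "Date": "2021-01-01", "Deaths": 30},
    {"Province": "Iowa", "Date": "2021-01-02", "Deaths": 11},
]

wanted_states = {"Iowa", "Texas"}  # arbitrary example selection

display_data = []
for record in sample_data:
    # Keep the record only if its state is one of the wanted ones
    if record["Province"] in wanted_states:
        display_data.append(record)

print(len(display_data))
```

With the real response, display_data could then be passed to px.line in place of data.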