Stay Informed: How to Pull Your Own COVID-19 Data
March 26, 2020 | Tech Hacks | Jimmy Jones

For all the technology we have, it can still be frustratingly difficult to get any concrete information from the media. Often all you want to do is cut through the noise and see some real numbers. Watching talking heads argue for half an hour probably isn’t going to tell you much about how COVID-19 is spreading through your local community, but seeing real-time data pulled from multiple vetted sources might.

Having access to the raw data about COVID-19 cases, fatalities, and recoveries is, to put it mildly, eye-opening. Even if everyday life appears basically unchanged in your corner of the world, seeing the rate at which these numbers are climbing really puts the fight into perspective. You might be less inclined to go out for a leisurely walk if you knew how many new cases had popped up in your neck of the woods during the last 24 hours.

But this article isn’t about telling you how to feel about the data, it’s about how you can get your hands on it. What you do with it after that is entirely up to you. Depending on where you live, the numbers you see might even make you feel a bit better. It’s information on your own terms, and in these uncertain times, that might be the best we can hope for.

Scraping the CDC Website

If you’re in the United States, then the Centers for Disease Control and Prevention (CDC) is perhaps the single most reliable source of COVID-19 data right now. Unfortunately, while the agency offers a wealth of data through its Open Data APIs, it seems that things are moving a bit too fast for them to catch up at the moment. At the time of this writing there doesn’t appear to be an official API to pull from, just a human-readable website.

Of course, if we can read it, then so can the computer. The site is simple enough that we can pull out the number of total cases with nothing more than a few lines of Python; we don’t even need to use a formal web scraping library. It should be noted that this isn’t a good idea under normal circumstances, as changes to the site layout could break it, but this (hopefully) isn’t something we’ll need to maintain for very long.

    import requests

    # Download the web page
    response = requests.get('https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html')

    # Step through each line of HTML
    for line in response.text.splitlines():
        # Look for the line containing the case count
        if "Total cases:" in line:
            # Split out just the number
            print(line.split()[2][:-5])

Everything should be pretty easy to understand in that example except perhaps the last line. Essentially it takes the string from the web page, splits it up using whitespace as the delimiter, takes the third piece, and then cuts the last five characters off the back to remove the closing HTML tag. Definitely a hack, but that’s sort of what this site is all about.

There are a couple of important things to keep in mind when pulling data from the CDC like this. First, since the website is a vital source of information right now, don’t hammer it. There’s really no reason to hit the page more than once or twice a day. Second, even in a pandemic the CDC is apparently keeping normal business hours; the site says the stats will only be updated Monday through Friday.
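If slicing off a fixed number of characters feels too fragile, a regular expression is one slightly sturdier option, since it doesn't care which closing tag follows the number. This is only a sketch: the helper name `extract_total_cases` and the sample HTML fragment are made up for illustration, and it still assumes the page contains the literal text "Total cases:" followed directly by the number.

```python
import re

def extract_total_cases(html):
    """Return the first number following 'Total cases:' as an int, or None."""
    match = re.search(r"Total cases:\s*([\d,]+)", html)
    if match is None:
        return None
    # Strip thousands separators before converting
    return int(match.group(1).replace(",", ""))

# Hypothetical fragment for illustration, not copied from the CDC page
sample = "<li>Total cases: 1,234</li>"
print(extract_total_cases(sample))
```

With a live page you would pass in `response.text` instead of the sample string. A `None` result is a hint that the page layout has changed.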

Johns Hopkins Unofficial API

A better option, especially if you’re looking for global data, is the database maintained by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). This data is collected from multiple sources all over the globe and is updated continuously. Whether or not you realized it at the time, there’s an excellent chance you’ve already seen their online dashboard, as it has become an invaluable reference for anyone tracking the progress of COVID-19.

This data is published to an official GitHub repository on a daily basis for anyone who wants to clone it locally, but that's not terribly convenient for our purposes. Luckily, French data scientist Omar Laraqui has put together a web API that we can use to easily query the database without having to download the whole thing.

His API offers a great deal of granularity, and lets you do things like see how many cases or recoveries there are in specific provinces or states. You need to experiment with the location codes a bit since there doesn't appear to be a listing available, but once you've found the ID for where you want to look, it's easy to pull the latest stats.

    import requests

    # Get data on only confirmed cases
    api_response = requests.get('https://covid19api.herokuapp.com/confirmed')

    # Print latest data for location ID 100: California, USA
    print(api_response.json()['locations'][100]['latest'])

The API also has a convenient endpoint at /latest which will simply show you the global totals for active cases and deaths.

Virus Tracker API

Another option is the free API available from thevirustracker.com. I'm somewhat reluctant to recommend this service, as it has all the hallmarks of somebody looking to profit from a disaster, and the site seems to go down regularly. That said, it's easy to use and gives you a number of very convenient data points that don't seem to be available from other sources.

For instance, it lets you see how many of the cases are considered severe, as well as how many new cases have been added today. The API also includes a listing of recent news items related to the country you have selected, which could be useful if you're looking to make your own dashboard.

    import requests

    # Request fails unless we provide a user-agent
    api_response = requests.get('https://thevirustracker.com/free-api?countryTotal=US', headers={"User-Agent": "Chrome"})
    covid_stats = api_response.json()['countrydata']

    # Break out individual stats
    print("Total Cases:", covid_stats[0]["total_cases"])
    print("New Today:", covid_stats[0]["total_new_cases_today"])
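Since this service has a habit of going down, it may be worth unpacking the response defensively so a bad reply doesn't crash your script. The following is a minimal sketch that assumes the response has the shape used above (a `countrydata` list whose first entry holds `total_cases` and `total_new_cases_today`). The function name `summarize_country` and the sample payload are hypothetical.

```python
def summarize_country(payload):
    """Return the two counts from a parsed response, or None if it is malformed."""
    try:
        stats = payload["countrydata"][0]
        return {
            "total_cases": stats["total_cases"],
            "new_today": stats["total_new_cases_today"],
        }
    except (KeyError, IndexError, TypeError):
        # Missing keys, an empty list, or a non-dict payload all end up here
        return None

# Made-up payload for illustration only
sample = {"countrydata": [{"total_cases": 10, "total_new_cases_today": 2}]}
print(summarize_country(sample))
print(summarize_country({}))
```

In practice you would pass in `api_response.json()` and simply skip the update whenever the function returns `None`.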

The COVID Tracking Project

While it doesn't have the global data of the Johns Hopkins database, the COVID Tracking Project offers one of the most feature-rich and well-documented APIs currently available. The project was started by a team of journalists in early March, and is dedicated to collecting accurate information not only from state and district health agencies, but from a network of trusted local reporters. Interestingly, they say the project would likely be suspended should the CDC roll out their own API with complete state-level data.

There's an incredible wealth of information available through this API, including historical daily data for individual states where available. The API is also unique in that it shows not only how many people have tested positive for COVID-19, but also how many tested negative. This lets you see how much testing each state is actually doing, and is one of the key data points the project wants to see reported officially by the CDC.

    import requests

    # Get current data for New York, USA
    api_response = requests.get('https://covidtracking.com/api/states?state=NY')

    # Show positive and negative test results
    print("COVID-19 Testing Results in", api_response.json()['state'])
    print("Positive:", api_response.json()['positive'])
    print("Negative:", api_response.json()['negative'])

Because their team carefully reviews the data before pushing it out, the COVID Tracking Project might not get updated as quickly as some other sources. But if you're looking for the most accurate information about what's happening "on the ground" in the United States, this is definitely the API you want.

Knowledge is Power

After experimenting with these data sources for a bit, you're likely to notice that they don't always agree. Things are moving so quickly that even when going directly to the source like this, there's a certain margin of error. A reasonable approach might be to take multiple data sources and average them together, though that assumes you're able to drill down to the same level on each service.
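Averaging could look something like the sketch below. It assumes you've already pulled one count per source for the same region, using `None` for any source that failed to respond. The source labels and numbers are placeholders, not real figures.

```python
from statistics import mean

def average_sources(readings):
    """Return (rounded average, spread) over the sources that reported a value."""
    available = [value for value in readings.values() if value is not None]
    if not available:
        return None
    # Spread gives a rough feel for how much the sources disagree
    return round(mean(available)), max(available) - min(available)

# Placeholder readings; "source_c" stands in for a service that was down
readings = {"source_a": 100, "source_b": 110, "source_c": None}
print(average_sources(readings))
```

Reporting the spread next to the average is a cheap way to see when the sources have drifted far enough apart that the average stops meaning much.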

As stated in the introduction of this post, what you do with this information is entirely up to you. For my own purposes I put together a small network-attached display so I can keep track of the total number of cases in the United States, and honestly it's been a sobering experience. Seeing the number increase by thousands every day has put the situation into focus for me, and I know that by the time this article is published, the number shown in the picture will be considerably lower than the current figures.

I can't say I'm particularly happy to have the latest numbers on my desk every morning, but I'd rather know what we're up against than remain oblivious. Something tells me many of you will feel the same way. If you're looking for less of a downer, you could always roll in some happier data, maybe even showing an animation whenever the number of recoveries goes up. Stay safe out there.
