Let's Keep Things Flat: Visualizing Hope for a Gradual, Successful, Re-Opening from COVID-19

Written by Christian Felix, 18.Apr.2020, with contributions from Anna Foard @stats_ninja

Over the last few months, COVID-19 numbers have taken the national and global spotlight. Throughout this time we’ve all been hoping for, and increasingly working towards, ‘flattening the curve’ in our communities. The events of the last few days tell us that, to varying extents, this has been working. This past Thursday the White House released new guidance to help state and local officials navigate reopening their economies, and countries in Europe are also starting to re-open their societies where circumstances allow.

Metrics like the overall number of cases and the number of deaths are often easier for the average person to consume and understand than growth rates, doubling times, or counts per thousand. Even so, the effectiveness of the measures that have radically re-shaped our lives will ultimately come to be measured not only in raw counts but also in increments of change and growth over time.

Going back to the curve – what is it? 

the curve.jpg

The concept of “flatten the curve” supposes that a certain number of people will become infected with COVID-19. The total is unknown, but with social distancing and staying home, those infections will be spread out over time instead of arriving all at once. Because COVID-19 can be so deadly and can require hospitalization, the goal is to keep hospitals working below capacity, ultimately avoiding additional deaths among those who still need care.

Through the challenges of this pandemic, there has been some incredible work by the data journalism community to help the public understand the importance of flattening the curve, and whether or not we are succeeding in our efforts to indeed flatten it. 

John Burn-Murdoch’s log scale chart from the FT will likely be remembered by those of us in the data visualization community long after the pandemic has passed

Visualizing the Objective

The question that we have wrestled with over the past weeks is this: how do we know that the curve has become sufficiently flat? What data-driven thresholds are we looking for to know we have reached the point where things can gradually start to be re-opened?

One of the more compelling and helpful pieces to address this question has come from the American Enterprise Institute and a team led by former FDA Commissioner Dr. Scott Gottlieb. Their roadmap to re-opening, which was released on March 29th, contains a four-phase approach with specific thresholds for moving from phase to phase. One of those thresholds is when a state reports a 'sustained reduction in cases for at least 14 days (i.e. one incubation period).'

Image is taken from page 3 of the National Coronavirus Response (Gottlieb, Rivers, McClellan, Silvis, and Watson)

This aligns with the guidance released by the Trump administration last Thursday, which says that a state or region should be experiencing a 'downward trajectory of documented cases within 14 days' before proceeding to a phased comeback.

This, of course, is not the only target. 

Local hospital capacity, availability of testing, and the capability to effectively monitor confirmed cases and trace their contacts are also vitally important factors. But for those of us working with the publicly available case data from JHU or the NYTimes, and looking to provide data visualization resources and tools as a public service, one of the key objectives at this point should be to move beyond reporting counts and to clearly convey whether or not a locality is experiencing a 'downward trajectory' or a 'sustained reduction' of cases on a consistent basis over a two-week period.

Untangling Terminology

There are various ways to do this. How have we done it?

1) Comparing each day's rate of change to the 14-day moving average rate of change allows us to better understand whether cumulative case counts are on a downward trajectory or an upward trajectory.

Consider the example of Orleans County, Louisiana, where rates of confirmed cases increased steadily through March and into early April:

The spread of the outbreak in Orleans County was significantly brought under control around the second week of April. Confirmed cases have continued to increase ever since, but at a decreasing rate.

The objective is to calculate these trends at the county level, and then encode and visualize them in a way that makes them easy for the consumer to understand, both for their own county and for other counties in their area. Consider the calculation detail for Orleans County, Louisiana shown below:

Where Column A is more than Column B, the day is assigned a status of ‘INCREASE’; where it is lower, it is assigned a status of ‘DECREASE’. Column D looks at the pattern in Column C, assigns a category, and encodes it with an orange and blue color-blind-friendly diverging scale.
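As a rough sketch of this comparison in Python: the example below assumes that Column A holds each day's new cases and that Column B holds the trailing 14-day average of that figure. Those column definitions, the function names, and the handling of ties are assumptions made for illustration, not a description of the workbook behind the charts.

```python
from statistics import mean


def daily_changes(cumulative):
    """Day-over-day change in a cumulative case series."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def classify_days(cumulative, window=14):
    """Label each day 'INCREASE' or 'DECREASE' against its moving average.

    Assumption: the daily change plays the role of Column A and the mean of
    the trailing `window` daily changes (current day included) plays the
    role of Column B. Days without a full window are labelled None.
    """
    changes = daily_changes(cumulative)
    statuses = []
    for i, change in enumerate(changes):
        if i + 1 < window:
            statuses.append(None)  # not enough history yet
            continue
        average = mean(changes[i - window + 1 : i + 1])
        # Ties are treated as 'DECREASE' here; the article does not say.
        statuses.append("INCREASE" if change > average else "DECREASE")
    return statuses


def current_streak(statuses):
    """Return (status, length) of the run of identical labels ending today."""
    labelled = [s for s in statuses if s is not None]
    if not labelled:
        return None, 0
    latest = labelled[-1]
    length = 0
    for status in reversed(labelled):
        if status != latest:
            break
        length += 1
    return latest, length


# Made-up cumulative counts and a short 3-day window, for illustration only.
example = [0, 1, 3, 6, 10, 15, 17, 18, 18]
example_statuses = classify_days(example, window=3)
example_streak = current_streak(example_statuses)
```

The streak helper corresponds to the "consecutive days" figures quoted in the chart captions, under the same assumptions.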

Looking at the chart once again for Orleans County, Louisiana with the color-coding applied is helpful:

After Increasing at an Increasing rate for 29 consecutive days, the case count rate in the county has been Increasing at a Decreasing rate for 10 consecutive days as of 18.Apr

Ultimately, what we end up with is a categorization for each day, for each county representing a rate of change that is either ‘Increasing at an Increasing rate’ or ‘Increasing at a Decreasing rate’.

Visualizing these categories over a 30-day period using strip plots allows the viewer to quickly see the trend status across many different counties, and adds tremendous scalability to the visualization.

Total Confirmed Cases by county and strip plots of consecutive day trends over a 30 day period ending on 18.Apr for the top 12 counties by case count

Visualizing the results over time or across points in time becomes particularly compelling as it allows us to see how counties are ascending the curve or descending the curve; increasing at an increasing rate (orange) or increasing at a decreasing rate (blue):

Top 12 Counties by Case Count on April 18th and April 11th. There is significantly more blue in the strip plots on April 18th. This is a good sign.

An additional piece of important information is also encoded in the strip plots: each county's ‘Peak Date’ (indicated by the black bar ‘ | ’ ). This represents the day that experienced the largest case count increase for each county. For many counties in the image above, the peak date is a week or two behind them.
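The peak date as described here can be sketched in a few lines of Python. The function name and the sample figures are hypothetical, and picking the earliest day when two days tie is an assumption.

```python
from datetime import date, timedelta


def peak_date(dates, cumulative):
    """Return the date with the largest day-over-day increase in cases.

    `dates` and `cumulative` are parallel lists; the first date has no
    prior day to compare against, so it can never be the peak. On ties,
    max() keeps the first match, which is the earliest date.
    """
    increases = [
        (after - before, day)
        for before, after, day in zip(cumulative, cumulative[1:], dates[1:])
    ]
    _, day = max(increases, key=lambda pair: pair[0])
    return day


# Made-up counts for five consecutive days, for illustration only.
sample_dates = [date(2020, 4, 1) + timedelta(days=i) for i in range(5)]
sample_cases = [10, 15, 30, 38, 40]
sample_peak = peak_date(sample_dates, sample_cases)
```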

Fooled by Geography

Maps have played a prominent role in visualizing the spread of COVID19, and rightfully so. They allow us to quickly and intuitively understand the content of the data and associate it with a geographic location:

https://coronavirus.jhu.edu/us-map

But they do have flaws that need to be addressed. For one, in instances where data exists for Hawaii and Alaska, a US Albers projection map should be used, or some other means of including those states (and possibly Puerto Rico) in the analysis. Tools like mapshaper.org or Development Seed's Dirty Reprojectors app make this easy enough to accomplish and should be a consideration for the dataviz designer looking to comprehensively convey the data.

Secondly, the county-level choropleth tends to elevate the importance of the square mileage of the county at the expense of the number of cases being measured. Using the JHU map shown above as an example, our perception is quickly drawn to Arizona and California when it should instead be focused on New York and New Jersey.

Our county-level choropleth succumbs to this as well. The map below shows rate increases or decreases from March 20th to April 18th:

The map is turning considerably bluer. This is a good thing. However, many of the dark blue counties are counties that have not experienced significant case counts (comparatively). The hardest-hit counties in New York, New Jersey, Illinois and Pennsylvania are still a much lighter hue of blue or even remain orange.

In the end, every chart is a bit of a compromise, and even the best chart can be complemented by visualizing the data in other ways. We’ve done this by adding metrics that provide insight into trends at the county level.

The Strip Plots and Spark Lines were added to complement and correct some of the deficiencies inherent in the choropleth map

The sparklines show the doubling time for each county. I’ve calculated doubling time in days using the following formula: doubling time = (x * ln(2)) / ln(y / z), where:

Day 0 = the day the county first surpassed 10 cases

x = the number of days that have passed since Day 0

y = the number of cases on Day x

z = the number of cases on Day 0

Consider the following example using King County, Washington:

double time example v3.png

Using this methodology we have calculated the doubling time in days for each county and charted the 30-day trends, including the doubling time in days on the last day in the period, and the date and value associated with the lowest doubling time.
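The formula can be written out as a short Python sketch. The function names are hypothetical, and returning None when the count has not grown since Day 0 (where the formula is undefined) is an assumption, not something the article specifies.

```python
import math


def doubling_time_days(x, y, z):
    """Doubling time in days: (x * ln(2)) / ln(y / z).

    x: days elapsed since Day 0
    y: cases on Day x
    z: cases on Day 0
    Returns None where the formula is undefined (no elapsed days or no
    growth since Day 0); that convention is an assumption.
    """
    if x <= 0 or z <= 0 or y <= z:
        return None
    return (x * math.log(2)) / math.log(y / z)


def doubling_time_series(cumulative, threshold=10):
    """Doubling time for every day after Day 0.

    Day 0 is the first day the cumulative count surpasses `threshold`
    (10 cases in the article). Returns an empty list if it never does.
    """
    day0 = next((i for i, c in enumerate(cumulative) if c > threshold), None)
    if day0 is None:
        return []
    z = cumulative[day0]
    return [
        doubling_time_days(i - day0, y, z)
        for i, y in enumerate(cumulative)
        if i > day0
    ]


# Made-up counts that double every day after Day 0, for illustration only.
series = doubling_time_series([4, 8, 12, 24, 48])
```

Because z is fixed at the Day 0 count, each value is the average doubling time over the whole period since Day 0, which is why the sparklines change gradually.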

What does it all mean?

The results are hopeful.

For most counties, doubling times are increasing (more days to double) over the 30-day window, and the dates associated with the lowest doubling times are a couple of weeks in the past.

Once again, the case trajectory is just one piece of the puzzle. The other critical factors (availability of tests, hospital capacity, ability to track cases and case contacts) are important but are not incorporated into this analysis. That said, for those of us who have lost loved ones to this virus, or have lost jobs, or have lost our sanity in quarantine, this should prove to be good news. It shows us that in many places, things on the ground are improving and that whatever a ‘post-COVID’ return to normal looks like, the data seems to warrant a prudent and gradual move towards it.

None of this means that we are out of the woods yet. There is always the possibility of a resurgence or another outbreak. The hope is that elected officials and others who are tasked with deciding whether or not to re-open will find this tool, or others like it, useful for understanding not just case and death counts in their area, but also how things are changing over time.

Click on the image to access the data visualization on Tableau Public
