Each year, Gartner invites vendors from Magic Quadrant for Analytics and Business Intelligence platforms to present at the Gartner Data & Analytics Summit and show how their product can help solve real-world problems and find insights in real-world data. In previous years we’ve looked into the opioid crisis in the USA, global happiness, and life expectancy. This year the impact of COVID-19 was the obvious candidate! There is a wealth of data available on the internet, and while Gartner gave us some direction for the business questions we would investigate, we could use data from any number of sources to do the analysis.
This video shows the final demos we produced for the bakeoff. There are 5 three- or four-minute demos covering various aspects of Power BI. It starts on Section 2 because Section 1 was introductions – and you already know what Power BI is!
In the below video and blog post Will Thompson shows and explains what we’ve done for this bake-off!
Data preparation
I started by loading in the data. Gartner had provided links to a variety of sources such as Our World in Data. There’s a particularly good resource here in the CSV file they provide at https://covid.ourworldindata.org/data/owid-covid-data.csv – it gets updated daily, and you can access it directly from Power BI Desktop. Get Data with the Web connector and paste the URL in:
Power BI will handle the rest, you can just click load to pull in the table:
Learn more about the web connector here.
Depending on the file you’re loading you might want to adjust the ‘Encoding’ value if accented characters are not shown properly e.g.:
You can do this by selecting the settings icon at the right of the Source step in your query and selecting from the dropdown: (or by editing the M script in the editor). Learn more about CSV encoding here
One of the main advantages of connecting directly to a CSV file hosted on GitHub or other web locations is that it can be refreshed automatically. Once you’ve built your report, clicking Refresh in the Power BI Desktop ribbon retrieves any new data that’s been included in the CSV file. If you publish your PBIX file to the Power BI service, you can even configure scheduled refresh to retrieve the data automatically.
I made one choice in loading the data to smooth out some spikes in the number of reported cases which might skew the results. Some states reported significantly higher numbers of cases on certain days – for example New Jersey reported 2158 cases on 25th June vs less than 400 on the other days that week. I took an average number and simply replaced those values, which might not be the most statistically correct way to approach cleaning this data, but it was a quick and easy way to avoid outliers changing the high-level analysis we were doing.
Data modeling
As is typical for the bakeoff, the variety of datasets coming from different locations means there’s rarely a neat, clean, star schema to adhere too. And in fact for the most part it’s not necessary – we’re focused here on finding interesting insights and telling a good story, not optimizing for large datasets or high performance. For the most part I left the tables in the same shape as they came in from the data sources, with some small pivoting to arrange data in a more ‘columnar’ format. I did separate out date, country, and state tables to provide dimension tables that could be used to relate the various data. This is probably the part of the solution I’d least take as a representation of best practices – a star schema is usually the best option.
I made a couple of calculated tables to restructure the data about government mandates and restrictions. I wanted to group countries by the number of days where restrictions such as mask-wearing requirements, travel restrictions, etc. were in place. The DAX calculation summarized each column, and a feature of the DAX editor was helpful – you can ALT-click in the editor window to add multiple cursors. If your DAX expressions have repeating lines like this it can save some manual repetitive edits! There are other keyboard shortcuts that are worth checking out to save you time while writing DAX expressions.
Presenting insights in the data
There’s a whole raft of data out there about the vaccination rollout, COVID cases and deaths etc. and finding patterns in the data is a very sensitive topic. Every data point represents humans who got sick or died because of the pandemic, and countless friends and family members who’ve lost loved ones. Any of the ‘insights’ we find in preparation for the bakeoff have to be taken ‘with a pinch of salt’ – I’m no epidemiologist and there are so many factors at play that might not be included in the data. To try and keep a more positive spin on this, I focused on the vaccination rollout to begin with.
Plotting out the cases and vaccinations per million at a country level gave a good starting point, but you can see from this that my initial visuals weren’t exactly engaging:
I called on Chris Hamill of the Power BI Customer Advisory Team to help make the report look more alluring – after all that’s the name of his blog. It was also his idea to use the great ImageGrid visual for the flags on the later pages of the report; kudos to Fredrik Hedenstrom for producing that visual!
Chris also set up the bookmarks for navigation between the pages in the report. If I was to reproduce the report I’d take advantage of the upcoming page navigator visual that we demoed at MBAS, it would have simplified setting up and maintaining those bookmarks!
On the subject of bookmarks, the ‘tabs’ on the US Vaccinations page combine to sets of bookmarks – the All States / Warmer / Cooler States update the filters applied to the other visuals, while the Comparison swaps the visible chart visual from the small multiple with many states to the aggregated version showing just two. This shows how flexible bookmarks can be to combine different types of visuals to tell your story.
For the small multiple visuals, I experimented with different ways to sort the individual bar charts for each state. One thing I wanted to do was sort them so the states with the highest number of cases overall appeared at the top of the list. I used the Sort By option to tell Power BI to sort the states by a different field – but allow for other visuals to continue sorting states alphabetically I made a second field “States (by cases)” and set the Sort By order just on that. I also made a custom column that sorted a few states of interest to the top of the list – you can see it in the model, called “State sorting” – with a simple DAX formula:
When building the animated bubble chart, I needed to show the data over time. I tried putting the date field into the Play Axis bucket, but with hundreds of days to play, it took a long time to run the animation. Grouping at the week level made more sense – I used the WEEKDAY() DAX function to create a calculated column that held the date for the start of the week:
And added that to the Play Axis. By default, the date formatting included the redundant Year part, so I set a custom format string to just show the day and month. I didn’t want to just create a string column as the sorting would then need to be edited as well.
In the second demo section, you’ll see me quickly add two visuals using Q&A. To make sure that I got meaningful visuals straight away, I hid most of the fields from the model. There were lots of columns that talked about ‘cases’ (e.g. confirmed cases, verified cases, cases per million, per state/per country) and hiding fields is a quick way to tailor the visuals Q&A provides in response to a given question. You can also use the Q&A Setup dialog to explicitly manage the synonyms for any given field and whether or not it is included in the Q&A schema.
Sharing the data
The great thing about our Teams integration was how it’s so easy to configure – it took very little time to set up my demo! To add a Power BI tab to a Teams channel click the + button in the Channel header
Select the Power BI app
And then find the report you want to pin into the channel.
Recently announced features
You also saw in the ‘Sharing and collaboration’ section a demo of Power BI Goals. This is a new feature that lets you create scorecards to track your key business metrics, assign them to your colleagues to help drive accountability, and attach notes that help provide contextual explanations about the progress towards those goals.
You can learn more about Power BI Goals in our announcement blog post
In the ‘differentiator’ section I showed how we’re bringing Power BI reports to the physical world with our ‘Data in Place’ feature in Power BI Mobile. This uses the power of Azure Spatial Anchors to tag reports into a physical location, using your phone’s AR capabilities to position reports in the context of your where your business processes are happening – in this case as we’re checking in patients the vaccination center in downtown Seattle.
This capability is coming soon, and you can see more about it and how we’re bringing BI and AR together with Hololens in our MBAS session
Demo file
Thanks for reading! If you’d like to download the PBIX used for the demo, you can find it on GitHub.