This blog post was authored by Shish Shridhar, Director of Business Development – Retail, Microsoft
By combining demographic data like Median Income, Education Levels, Median Age and customer purchasing data such as preferences, past purchases, and online behavioral data, retailers gain a more in-depth understanding of customer needs and wants than with just past purchase data. Power BI provides powerful capabilities for combining data from various sources and enabling visual correlation: you can certainly use the Data Analysis Tookpak in Excel and run a Correlation Coefficient on the combined data as well.
To test this out, I came with the assumption that Seattle has the most Starbucks and that demographics affect the number of stores. To learn what the answer was, I used Power BI and Excel. Here is what I did:
I looked for oData sources relevant to retail and found one via Socrata: https://opendata.socrata.com/
I ran a search for Starbucks, to see if there was a list of Starbucks Restaurants from around the world. I did find a oData source of with a listing of all the Starbucks Restaurants around the world: https://opendata.socrata.com/Business/All-Starbucks-Locations-in-the-World/xy4y-c4mk
Here is the oData source link to access the data: https://opendata.socrata.com/OData.svc/xy4y-c4mk
Using Power Query for Excel, I was able to access the data using the oData option. This returns 20, 621 rows of data containing details of Starbucks restaurants around the world:
To get a better insights from the data, I used Power View for Excel to create a visualizations. A quick drag & drop of the Brands against the count of the StoreIds showed me the brands represented by the data:
I was curious about the countries with the most Starbucks, so I dragged in the Country information along with a count of the StoreIds. Here is the result:
And interestingly, Seattle is not the city with the most Starbucks, as I’d assumed:
Power Map for Excel enables visualizing this data on a Map as a layer of information:
I was able to obtain US Census Data from Neustar and I imported this data into Excel. This data included Zip codes as well as detailed information about every Zip code. I could potentially use this information to correlate things like median age, median income, population around each of the Starbucks in the US. The Data looks like this:
When I overlay the Census Data on top of the Starbucks Store Locations, I get a visual correlation between demographics data and Starbucks locations:
Here is a Power Map for Excel video of two layers: Starbucks store locations with Median Income by Zip code:
There are several sources of interesting public data that you can use to analyze Retailers: proximity analysis of retailers and their competition using data from Yelp and Foursquare; Correlating retail yelp rating and FourSquare CheckIns against Demographic data; Correlating Weather data against store performance.
Here’s the actual live visualization I created with Power View:
You can check out some more examples at my blog.