This blog post was authored by Shish Shridhar, Director of Business Development - Retail, Microsoft

By combining demographic data like Median Income, Education Levels, Median Age and customer purchasing data such as preferences, past purchases, and online behavioral data, retailers gain a more in-depth understanding of customer needs and wants than with just past purchase data. Power BI provides powerful capabilities for combining data from various sources and enabling visual correlation: you can certainly use the Data Analysis Tookpak in Excel and run a Correlation Coefficient on the combined data as well.

To test this out, I came with the assumption that Seattle has the most Starbucks and that demographics affect the number of stores. To learn what the answer was, I used Power BI and Excel. Here is what I did:

I looked for oData sources relevant to retail and found one via Socrata: https://opendata.socrata.com/

 

I ran a search for Starbucks, to see if there was a list of Starbucks Restaurants from around the world. I did find a oData source of with a listing of all the Starbucks Restaurants around the world: https://opendata.socrata.com/Business/All-Starbucks-Locations-in-the-World/xy4y-c4mk

 

Here is the oData source link to access the data: http://opendata.socrata.com/OData.svc/xy4y-c4mk

Using Power Query for Excel, I was able to access the data using the oData option. This returns 20, 621 rows of data containing details of Starbucks restaurants around the world:

 

To get a better insights from the data, I used Power View for Excel to create a visualizations. A quick drag & drop of the Brands against the count of the StoreIds showed me the brands represented by the data:

 

I was curious about the countries with the most Starbucks, so I dragged in the Country information along with a count of the StoreIds. Here is the result:

 

And interestingly, Seattle is not the city with the most Starbucks, as I’d assumed:

 

Power Map for Excel enables visualizing this data on a Map as a layer of information:

I was able to obtain US Census Data from Neustar and I imported this data into Excel. This data included Zip codes as well as detailed information about every Zip code. I could potentially use this information to correlate things like median age, median income, population around each of the Starbucks in the US. The Data looks like this:

 

When I overlay the Census Data on top of the Starbucks Store Locations, I get a visual correlation between demographics data and Starbucks locations:

 

Here is a Power Map for Excel video of two layers: Starbucks store locations with Median Income by Zip code:

There are several sources of interesting public data that you can use to analyze Retailers: proximity analysis of retailers and their competition using data from Yelp and Foursquare; Correlating retail yelp rating and FourSquare CheckIns against Demographic data; Correlating Weather data against store performance.

Here's the actual live visualization I created with Power View:

You can check out some more examples at my blog.