Skip to main content

Visualizing and interacting with your Azure Machine Learning Studio experiments

Headshot of article author The Power BI Team

Microsoft Senior Program Manager Christian Berg is back with another entry in his series on becoming your organization’s strategic advisor with Machine Learning and Power BI. If you need to catch up, you can find the five previous chapters summarized here on the blog.

—————————————————————-

Visualizing and exploring the results of your Azure Machine Learning Studio (ML Studio) experiments is useful both when developing and evaluating the model but most importantly when deploying your model and presenting the results. With Power BI Desktop this can be done in two ways, either by writing the results from the model to a database that you then connect Power BI to, or by using an R script visual (Rviz). Using an Rviz has the advantage that you can dynamically select the subset of the Power BI model that you want to score. This enables powerful What-If Analysis scenarios where you can use R and ML Studio to assess the distribution of likely outcomes based on the counterfactuals.

In this example we will look at connecting to an Azure ML Studio experiment with an Rviz and then building on that to create a dynamic report to explore cross price elasticities. We will also look at a simpler example where we instead use DAX to explore the impact of different discount percentages, based on an assumption about our elasticity.

For the example report we have an ensemble regression model that was built in ML Studio to predict revenue by store, product and month using historic data, marketing information and future prices. The timeseries component is pre-calculated from part two.

 

Using the example report

For this report to work in Power BI Desktop you need the packages AzureML and ggplot2 installed. If you have not previously used these packages on your computer, install them by going to your R console in R Studio or other R GUI and copy/paste install.packages(“AzureML “), and repeat that for “ggplot2”. If you are new to R, please see my previous post for installation instructions. Please note that AzureML uses networking services which are currently blocked in the Power BI service.

When you open the example report and go to page 1 (Simple what-if analysis) you will see a line chart showing the historic revenue in the data model and the hypothetical revenue amount, based on the average discount and price elasticity parameters. Below the line chart you will find the breakdown of both amounts in a drillable matrix visualization.

clip_image002

In the top right-hand side corner are the two parameters Average discount (applied across all products) and Price elasticity. By changing these two parameters you will see the other visualizations’ measures beginning with “new” updating. While these are straightforward relationships this can be powerful to allow the report’s users to develop an intuitive understanding for the underlying sensitivities. Especially when there is little prior knowledge about the variation in inputs but the relationships between the inputs and relevant KPIs are well understood and modeled.

On the second page, Azure ML Studio real-time scoring, we are not inputting any assumptions but instead using slicers and Power BI’s filter functions to dynamically select a subset of the data model to be sent to our Predictive ML Studio experiment.

clip_image004

The three slicers are followed by the Rviz which plots the actual revenue and the model’s predicted value. To the right of it is a normal line chart showing the historical values only. Unlike the previous page where we had to estimate elasticity this model was trained on all available data and uses the price points of the six products (P1-P6) to predict what the revenue would be. However, we are limited to only using real data for the scoring.

On the third page (What-If Analysis with Azure ML Studio) we have combined the previous two pages using both input parameters and real-time scoring. This allows us to test different price points per product with the Azure ML experiment. Just as on the previous page there is no need to estimate elasticity since the model has learnt this relationship through the historic data and by adding parameters for marketing campaigns etc. we can now assess the probable range of any scenario of interest.

clip_image006 To create a scenario, start by selecting the regions, stores and products that you are interested in using the corresponding slicers. Then select the time interval that you are interested in (this could have been exclusively or partially future dates). By changing the planned price on the different products, you can then examine which combination has the highest expected revenue or profit. This model takes cross-price elasticities into account, i.e. if one product is a complement or supplement to another product a change in its price will also affect units sold for the other product. In summary, a powerful tool to quickly compare the average Azure ML Studio prediction of the impact of your scenario against the base case.

 

Recreating the report

 

PAGE 1

To build what-if like dynamic reports in Power BI start by creating the parameters that the user should be able to input by clicking Modelling -> New Parameter in Power BI Desktop top menu.

image

This prompts a form that asks for the new parameter’s name, e.g. “Average discount”, data type, min and max values, the granularity of the steps that the slider should use and an optional default value. When you click OK a new table will be created with a calculated column and corresponding measure. The column contains the DAX formula that generates the series and you can double-click on it at any time to change the settings for the parameter. Repeat this for all the parameters that you want to let the user alter to define the scenario. For categorical values, e.g. “Social campaign”, “Display ads”, “Email” you can either Enter data or reference a new or existing table in the Query editor and then displaying it with a slicer. The what-if parameters should typically not be connected to any other tables, so you need to use DAX to use them. In this example I did this with the measure:

New Units_sold = SUM(Sales[Units_sold])*(1+[Average discount Value]*[Price eleasticity Value])

which changes the baseline quantity of products sold (Sales[Units_sold]) by the discount parameter, weighted by the elasticity. The measure:

New price = DIVIDE(SUM(Sales[Revenue]),SUM(Sales[Units_sold]))*(1-[Average discount Value])

calculates the current average price and the applies the average discount in the scenario to it. We can then use these two calculations to determine the new revenue:

New revenue = [New price]*[New Units_sold]

and then by adding these measures to our visualizations the report is complete allowing the user to immediately see the results of their inputs. For stochastic modelling you can use a randomization function like RAND() or Poisson.Dist() to determine for instance the probability of achieving a target based on the parameter input.

Below is an example of this first page, embedded as a publish to web report:

 

PAGE 2

To create the R based visuals on page 2 & 3, you will need an ML Studio based experiment that is deployed and the AzureML and ggplot2 R packages installed. Please note that since the AzureML R package uses networking capabilities it is not yet available in the service, i.e. only works on-premise.

clip_image011 clip_image013

These two visualizations are Power BI R script visuals that uses the R packages AzureML and ggplot2*. The former connects to an MLStudio experiment. For it to do that the scoring experiment needs to be exhaustively defined. In this example we will use the 1) Workspace ID and 2) Experiment name. In addition, the package needs the 3) Authentication information to be allowed to interact with the web service.

You can find the Workspace ID by signing in to studio.azureml.net and selecting the correct workspace from the top menu:
image

You then select the SETTINGS tab from the left-hand side menu. The WORKSPACE ID is the fourth property from the top:

image

Copy this information, e.g. to Notepad.

To get the AUTHORIZATION TOKENS click on the second tab in settings:

image

And copy one of the tokens to a new row in your Notepad.

The last thing that you need is the exact name of the predictive experiment. Once you have published it, you can get this by clicking on WEB SERVICES in the left-hand side menu and then on the name of the experiment in the list and you will see something like:

image

Copy the entire name, e.g. “pricingdevpoc [predictive exp.]” to your Notepad.

Now that you have the required connection information it is time to add the visuals to Power BI. Open a pbix file with the relevant data (typically a subset of the data that you used to train your model on). On the page that you want to add the interactive scoring: insert a simple standard visual, like a table and add the columns that you want to send to MLStudio. Please note that MLStudio requires an exact schema so the column names, sequence and type have to be identical to what the Web service input expects. I would also recommend that you click on the arrow next to each column name in the Values section for the visual and select Don’t summarize, to ensure that it looks correct without any empty rows or text values mixed with numbers etc.

Once this has been validated convert the visual to an R visual and add the below script updated with your experiments connection information:

image

The R script editor where you past the script is visible when you select the converted Rviz from the canvas:

image

Update the below script* with your connection details and paste it into the R script editor:

## Set the workspace id

wsid = "paste your workspace id here within the quotation marks"

## Set the workspace authentication key

auth = "paste your authentication token here within the quotation marks "

## Input the name of the Experiment

serviceName = "paste your service’s name here within the quotation marks "

## Load the AzureML library if not previously installed: install.packages("AzureML")

library("AzureML")

## Create a reference to the workspace

ws <- workspace(wsid,auth)

## Create a reference to the experiment

s <- services(ws, name = serviceName)

## Send the dataset to the Azure ML web service for scoring and store the result in ds

ds <- consume(s,dataset)

## Aggregate the scores to a single value by month

scores <- data.frame(Prediction = tapply(ds$Scored.Labels, ds$Month_ID, sum))

## Aggregate the revenue to a single value by month (for comparison)

revenue <- data.frame(Actuals = tapply(ds$Revenue, ds$Month_ID, sum))

## Combine the two resulting vectors in the new data.frame timePlot

timePlot <- cbind(scores, revenue)

## Load the ggplot library if not previously installed: install.packages("ggplot2")

require(ggplot2)

## Specify the data to plot and set the x-axis

ggplot(data = timePlot, aes(x = 1:nrow(timePlot))) +

## Plot the two lines

geom_line(aes(y = Prediction, colour = 'Prediction')) +

geom_line(aes(y = Actuals, colour = 'Actuals')) +

## Rename the x and y axis

xlab("Time") +

ylab("Result $") +

## Name the legend

labs(colour="Legend") +

## Change the colors of the line

scale_color_manual(values = c("green", "red"))

clip_image035

After pasting it in the editor, click the Run script play button. If you see an error message, click See details. If the message is “there is no package called…” install the required package, e.g. install.packages(“AzureML”) from your IDE for R, (RGui, RStudio etc.).

Once you have gotten this visual to work you can move on to creating a similar visualization which takes user defined parameters as input. Please see the beginning of this section for information on creating What-If parameters.

 

PAGE 3

Duplicate the page that you just created and substitute the columns from the data model with the corresponding parameter. If the parameters have the same names as the input schema it should work. In my example on page three I’ve given the parameters slightly different names to what the schema expects. This can be fixed by renaming them at runtime in the Rscript*. Below are two examples of how to do this, by name and by position:

## Rename the columns according to the import schema

## By name

names(dataset)[names(dataset)=="Product1 Price Value"] <- "P1"

## By position

names(dataset)[15] <- "P2"

names(dataset)[16] <- "P3"

names(dataset)[17] <- "P4"

names(dataset)[18] <- "P5"

## Set the workspace id

wsid = "paste your workspace id here within the quotation marks"

## Set the workspace authentication key

auth = "paste your authentication token here within the quotation marks "

## Input the name of the Experiment

serviceName = "paste your service’s name here within the quotation marks "

## Load the AzureML library if not previously installed: install.packages("AzureML")

library("AzureML")

## Create a reference to the workspace

ws <- workspace(wsid,auth)

## Create a reference to the experiment

s <- services(ws, name = serviceName)

## Send the dataset to the Azure ML web service for scoring and store the result in ds

ds <- consume(s,dataset)

## Aggregate the scores to a single value by month

scores <- data.frame(Prediction = tapply(ds$Scored.Labels, ds$Month_ID, sum))

## Aggregate the revenue to a single value by month (for comparison)

revenue <- data.frame(Actuals = tapply(ds$Revenue, ds$Month_ID, sum))

## Combine the two results vectors in the new data.frame timePlot

timePlot <- cbind(scores, revenue)

## Load the ggplot library if not previously installed: install.packages("ggplot2")

require(ggplot2)

labelNudge <- (max(scores) - min(scores))/15

## Specify the data to plot and set the x-axis

ggplot(data = timePlot, aes(x = 1:nrow(timePlot))) +

## Plot the two lines

geom_line(aes(y = Prediction, colour = 'Prediction')) +

geom_line(aes(y = Actuals, colour = 'Base case')) +

## Rename the x and y axis

xlab("Time") +

ylab("Result $") +

## Name the legend

labs(colour="Legend") +

## Change the colors of the lines

scale_color_manual(values = c("blue", "red")) +

geom_text(aes(label = format(round(Prediction, digits = 0), big.mark = ",", big.interval = 3L), y=Prediction), nudge_y = labelNudge)

The other difference to the first Rscript are small layout changes.

 

Links and downloads

studio.azureml.net

https://1drv.ms/u/s!Aq0_YqoiHrN3lv45rM5Zq5nZmG1qMA

 

* Third-party programs. This software enables you to obtain software applications from other sources. Those applications are offered and distributed by third parties under their own license terms. Microsoft is not developing, distributing or licensing those applications to you, but instead, as a convenience, enables you to use this software to obtain those applications directly from the application providers. By using the software, you acknowledge and agree that you are obtaining the applications directly from the third-party providers and under separate license terms, and that it is your responsibility to locate, understand and comply with those license terms.

Microsoft grants you no license rights for third-party software or applications that is obtained using this software.