Earlier this year, we introduced Automated Machine Learning (AutoML) in Power BI as a Public Preview. Now, we’re happy to announce that AutoML in Power BI is generally available in all public cloud regions where Power BI Premium and Embedded are available.
Using AutoML in Power BI, business analysts without a strong background in machine learning can build ML models to solve business problems that once required data scientists. Power BI automates most of the data science behind the creation of the ML models, while giving you full visibility into the process used to create your model. Since AutoML targets analysts who may not have prior experience building ML models, we have made a significant investment in adding automatic guardrails such as class balancing, training-test data split, cross-validation, missing value imputation, and high cardinality feature detection to ensure that the resulting model is of good quality.
Macaw, a Dutch full-service digital company, deployed automated machine learning in Power BI to quickly ingest sales data and train, validate, and invoke machine learning models directly in Power BI. Dave Ruijter, Principal Consultant Data and AI at Macaw, shared that “The automated functionality within Power BI helps us scale how we infuse our solutions with AI capabilities. Now Macaw Power BI analysts can include machine learning in their solutions without involving a data scientist.” One of their customers, Mitch van Deursen, the Co-owner and Chief Information Officer at Shoeby, says, “We now get answers to key business questions within five days, where normally modelling would take months.” Read about their story here.
With the Public Preview release, AutoML in Power BI enabled users to:
- Train a machine learning model to perform Binary Prediction, General Classification, and Regression
- View the model training report
- Apply the ML model to their data, and view predictions and explanations
Since then, we have been improving and adding new capabilities to AutoML in Power BI.
Binary prediction support for non-Boolean outcomes
Earlier, AutoML expected the outcome field for a binary prediction model to be a Boolean value. We now also support non-Boolean values in the outcome field. In the wizard, you can directly choose the target outcome that you’re most interested in, saving you the preprocessing steps of converting it to Boolean.
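For context, the preprocessing step that the wizard now saves you looked roughly like the sketch below. It is an illustration only: the function name, the sample "Churned"/"Active" values, and the target choice are assumptions made for this example, not part of Power BI.

```python
def to_boolean_outcome(values, target):
    """Map each outcome value to True if it equals the target outcome, else False."""
    return [value == target for value in values]


# Hypothetical non-Boolean outcome field with two possible values.
statuses = ["Churned", "Active", "Active", "Churned"]

# "Churned" is the outcome we are most interested in predicting.
labels = to_boolean_outcome(statuses, target="Churned")
print(labels)
```

With non-Boolean support, you pick the target outcome in the wizard instead of adding a conversion step like this to your data preparation.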
Improved Feature Recommendation
We improved the statistical methods that suggest input fields that can be used for training the ML model. AutoML now analyzes a sample of the selected entity, recommends fields, and shows the reasons for fields that are not recommended. A field is not recommended if it has too many distinct values, has only one value, or has very low or very high correlation with the output field.
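To make these rules concrete, here is a minimal sketch of how such a screening could work. It is not the method AutoML uses: the thresholds, the function names, the use of Pearson correlation, and the choice to apply the distinct-value check only to non-numeric fields are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen_field(values, outcome, max_distinct_ratio=0.9, low_corr=0.05, high_corr=0.95):
    """Return a recommendation string for one field. All thresholds are hypothetical."""
    distinct = len(set(values))
    if distinct <= 1:
        return "not recommended: only one value"
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if not is_numeric:
        # Assumption: the distinct-value check applies to categorical fields only.
        if distinct / len(values) > max_distinct_ratio:
            return "not recommended: too many distinct values"
        return "recommended"
    strength = abs(pearson(values, outcome))
    if strength < low_corr:
        return "not recommended: low correlation with the output field"
    if strength > high_corr:
        return "not recommended: high correlation with the output field"
    return "recommended"


# Hypothetical sample of an entity; "price" plays the role of the output field.
price = [10, 20, 30, 40, 50, 60]
fields = {
    "customer_id": ["a", "b", "c", "d", "e", "f"],
    "country": ["NL"] * 6,
    "price_copy": [1, 2, 3, 4, 5, 6],
    "noise": [1, -1, 0, 0, -1, 1],
    "rooms": [3, 1, 4, 1, 5, 9],
}
report = {name: screen_field(values, price) for name, values in fields.items()}
for name, reason in report.items():
    print(f"{name}: {reason}")
```

A field that is almost perfectly correlated with the output is flagged here because it is often a copy or a direct consequence of the outcome, which is one plausible reason for excluding it.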
Controlling training time
AutoML now allows you to control the time for training a model.
You can decrease the training time to see quick results, or increase it to get the best model. The former is useful when you are building a proof of concept (POC) or checking that you have selected the right fields.
Improved training reports
Training reports have been improved to make them more readable. Additionally, reports are now generated two times faster.
Binary prediction reports now include a Cost-Benefit analysis tool. Given an estimated unit cost of targeting and a unit benefit from achieving a target outcome, it helps you identify the subset of the population that should be targeted to yield the highest profit.
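The idea behind this kind of analysis can be sketched in a few lines. The example below is an assumption about how such a calculation might be done, not the report's actual implementation: it ranks the population by predicted score, targets it from the top down, and keeps the cutoff with the highest cumulative profit. The scores, outcomes, unit cost, and unit benefit are made-up values.

```python
def best_targeting_cutoff(scores, outcomes, unit_cost, unit_benefit):
    """Return (number of top-scored people to target, profit at that cutoff).

    scores: predicted probability of the target outcome per person.
    outcomes: True if that person actually achieved the target outcome.
    """
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    best_count, best_profit = 0, 0.0
    profit = 0.0
    for count, (_, achieved) in enumerate(ranked, start=1):
        # Every targeted person costs unit_cost; only successes return unit_benefit.
        profit += (unit_benefit if achieved else 0.0) - unit_cost
        if profit > best_profit:
            best_count, best_profit = count, profit
    return best_count, best_profit


# Hypothetical scored test data.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
outcomes = [True, True, False, True, False]

count, profit = best_targeting_cutoff(scores, outcomes, unit_cost=10.0, unit_benefit=25.0)
print(f"Target the top {count} of {len(scores)} for a profit of {profit}")
```

In this made-up data, targeting everyone would lower the profit, because the last person costs 10 to target and returns nothing.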
Explainable AI
AutoML emphasizes the explainability of predictions to provide visibility into fields that are most important. It provides top predictors during training as well as explanations for each prediction that the ML model produces during scoring.
The Top Predictors section has been improved to show a comprehensible feature breakdown so that you can easily validate that the model aligns with your business insights about the outcome field. In the house price prediction example below, the feature breakdown chart for “sqft_living” (on the right) shows that higher “sqft_living” values are associated with higher house prices.
In addition to this, we have added support for text features in top predictors.
Explanations for predictions are now surfaced as a separate entity to make them easily accessible and readable. In order to make the model predictions interpretable, we show the contribution of every feature towards the prediction, and these contributions add up to the predicted value.
In the house price prediction example below, you can see that some features have a positive influence (in green), and other features have a negative influence (in red). Adding these contributions to a base value (in this case, the average house price in the training data) gives you the predicted house price of $379,738, allowing you to easily explain the model's predictions.
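The additive relationship described above can be written out as a short sketch. The numbers below are invented for illustration and do not reproduce the example from the report; apart from “sqft_living”, the feature names are hypothetical as well.

```python
def predicted_value(base_value, contributions):
    """Prediction = base value + sum of per-feature contributions."""
    return base_value + sum(contributions.values())


# Hypothetical base value: average house price in the training data.
base_value = 400_000

# Hypothetical per-feature contributions for one house.
contributions = {
    "sqft_living": 35_000,   # positive influence (shown in green)
    "waterfront": -12_000,   # negative influence (shown in red)
    "yr_built": -8_000,      # negative influence (shown in red)
}

prediction = predicted_value(base_value, contributions)
positive = [name for name, value in contributions.items() if value > 0]
negative = [name for name, value in contributions.items() if value < 0]

print(f"Predicted price: {prediction}")
print(f"Pushes the price up: {positive}")
print(f"Pushes the price down: {negative}")
```

Because the contributions sum exactly to the difference between the prediction and the base value, each one can be read as that feature's share of the prediction.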
Using this explanations entity, you can quickly build reports explaining model predictions. Automatically generated explanation reports will be available shortly.
Get started building your own ML Models
Here is a step-by-step tutorial that shows how to build your machine learning model using AutoML. To learn more, refer to the documentation. If you have any questions or would like to get in touch with us about your use cases or pain points, please reach out to automlpowerbi@microsoft.com. We’d love to hear about your experiences, feedback, and ideas on how you’d like to use AutoML in Power BI.