Last year in December we announced new Scanner APIs for extracting tenant-level metadata using Power BI Admin REST APIs. Already we see many customers using the new Scanner APIs to query Power BI in order to build their own reporting and homegrown catalogs. We’ve designed the APIs to support scaling for big tenants, while at the same time allowing us to add more metadata to the API response as time goes by. The huge enhancement to the APIs we’re announcing today includes the items our customers have been asking for most – now, as part of the API response, you can get the metadata of dataset tables and columns, measures, DAX expressions, and mashup queries.
There are a variety of cases where this information can be useful. As Microsoft MVP Rui Romano told us:
One of the fundamental pillars of good Power BI Governance is discoverability & monitoring. The inclusion of sub-artifacts (tables, columns, measures, etc.) on Scanner APIs not only allows users to quickly catalog all the metadata from datasets on their Power BI tenant, it’s also a great feature for “lazy” Power BI developers like me who hate to write documentation that constantly gets outdated. By using this data we can deliver a Power BI Report to end users that serves as auto-documentation by allowing them to search for a measure/table on the entire tenant and not only see the description but also the DAX code of the measure to better understand the logic behind it.
The image below shows an example of a solution Rui built using the enhanced dataset schema of the Scanner APIs.
Power BI team collaboration with 1st and 3rd party catalog products
Any catalog tool that integrates, or is planning to integrate, with Power BI can benefit from the new improvements to the APIs.
The Power BI team has strong relationships with the market’s leading catalog products, and these products have already integrated the new scanner APIs to get the new, enhanced metadata. Here are a few of the things you’ll be seeing in these leading products soon – all made possible thanks to the enhanced API responses we’re announcing today.
- High granularity search: For instance, you can find a dataset by searching for a measure.
- DAX expression visibility: For any measure or calculated column, you can view the DAX expression.
- Detailed lineage: For instance, since you’ll be able to retrieve the mashup query associated with each dataset table, you can see which columns in which datasets a column in a SQL table is connected to.
If you use or plan to use Azure Purview, Informatica, Collibra, or Alation, you’ll find the Power BI integration already available in these products. And since the scanner APIs are public, we’re confident that other catalog products have integrated with Power BI and are using the APIs as well.
How does it work behind the scenes?
We use a caching mechanism to make sure your capacity resources are not impacted. Getting dataset details such as table and column names requires loading the model into memory. To make sure we don’t overload Power BI shared or Premium capacity, we cache the model information whenever the model is already in memory, rather than loading it into memory for each API call. This means that every time a dataset is successfully refreshed or republished, and the model is therefore in memory, its metadata is cached. Then, on every scan API call, we check whether the dataset information is in the cache, and if so, it is returned as part of the API response.
Caching happens on every successful dataset refresh and republish, but only if the following conditions are met:
- The Enhance admin APIs responses with detailed metadata (Preview) admin tenant setting is enabled (see Enabling enhanced metadata scanning).
- There has been a call to the scanner APIs within the last 90 days.
If the detailed low-level metadata requested is not in the cache, it is simply not returned. High-level metadata, such as dataset name, is always returned, even if the low-level detail is not available.
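Because low-level metadata can be missing for some datasets, code that consumes a scan result should treat tables, columns, and measures as optional. The sketch below shows one way to do that while searching for a measure across the tenant. The field names (`workspaces`, `datasets`, `tables`, `measures`, `expression`) are assumptions about the response shape and should be checked against the documentation; the sample data is made up for illustration.

```python
from __future__ import annotations

from typing import Iterator


def iter_measures(scan_result: dict) -> Iterator[dict]:
    """Yield one record per measure found in a scan result.

    Datasets whose detailed metadata was not in the cache are assumed to
    have no "tables" entry, so they are skipped rather than treated as errors.
    """
    for workspace in scan_result.get("workspaces", []):
        for dataset in workspace.get("datasets", []):
            for table in dataset.get("tables", []):
                for measure in table.get("measures", []):
                    yield {
                        "workspace": workspace.get("name"),
                        "dataset": dataset.get("name"),
                        "table": table.get("name"),
                        "measure": measure.get("name"),
                        # Expected only when the DAX and mashup setting is enabled.
                        "expression": measure.get("expression"),
                    }


def find_measures(scan_result: dict, text: str) -> list[dict]:
    """Case-insensitive substring search over measure names."""
    needle = text.lower()
    return [
        m for m in iter_measures(scan_result)
        if needle in (m["measure"] or "").lower()
    ]


# Hypothetical sample: the second dataset has only high-level metadata.
sample = {
    "workspaces": [
        {
            "name": "Finance",
            "datasets": [
                {
                    "name": "Sales model",
                    "tables": [
                        {
                            "name": "Sales",
                            "measures": [
                                {"name": "Total Sales", "expression": "SUM(Sales[Amount])"},
                                {"name": "Order Count"},
                            ],
                        }
                    ],
                },
                {"name": "Not cached yet"},
            ],
        }
    ]
}

matches = find_measures(sample, "sales")
```

Here `matches` holds a single record for the "Total Sales" measure, and the dataset without cached detail is passed over silently.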
Enabling enhanced metadata scanning
Go to Admin portal > Tenant settings, find the Admin API settings section, and enable the following two new feature switches.
- Enhance admin APIs responses with detailed metadata (Preview): This setting turns on the caching flow and enhances API responses with low-level metadata (for example, name and description) for tables, columns, and measures.
- Enhance admin APIs responses with DAX and mashup expressions (Preview): This setting allows the API response to include DAX expressions and Mashup queries. This setting can only be enabled if the first setting is also enabled.
See our documentation for more detail.
Scanning using the new enhanced Admin APIs involves just a few simple steps.
- First you decide which authentication method you’d like to use. Both service principals and Power BI service Admin delegated tokens are supported.
- Next, you perform a full scan to get all the workspaces and the metadata and lineage of their assets.
- Subsequently, you perform incremental scans to only get workspaces that have changed since the previous scan.
See our documentation for a walkthrough that demonstrates the flow in more detail.
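The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the endpoint paths and query parameters (`modified`, `getInfo`, `scanStatus`, `scanResult`, `datasetSchema`, `datasetExpressions`), the batch size of 100, and the `"Succeeded"` status value are assumptions to verify against the documentation. The `request` callable is a hypothetical helper that you supply; it should send an authenticated HTTP call (service principal or admin token) and return the parsed JSON.

```python
from __future__ import annotations

import time
from typing import Callable, Optional
from urllib.parse import urlencode

BASE = "https://api.powerbi.com/v1.0/myorg/admin/workspaces"
BATCH_SIZE = 100  # assumed maximum number of workspaces per getInfo call

# Hypothetical transport: request(method, url, json_body) -> parsed JSON.
Request = Callable[[str, str, Optional[dict]], object]


def scan_workspaces(
    request: Request,
    modified_since: Optional[str] = None,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Run a full scan (no modified_since) or an incremental scan."""
    # 1. List workspace IDs, optionally only those changed since a timestamp.
    query = urlencode({"modifiedSince": modified_since}) if modified_since else ""
    modified = request("GET", f"{BASE}/modified" + (f"?{query}" if query else ""), None)
    ids = [w["id"] for w in modified]

    # Ask for lineage plus the enhanced schema and expression metadata.
    options = urlencode({
        "lineage": "True",
        "datasourceDetails": "True",
        "datasetSchema": "True",
        "datasetExpressions": "True",
    })

    workspaces = []
    for start in range(0, len(ids), BATCH_SIZE):
        batch = ids[start:start + BATCH_SIZE]
        # 2. Trigger a scan for this batch of workspaces.
        scan = request("POST", f"{BASE}/getInfo?{options}", {"workspaces": batch})
        scan_id = scan["id"]

        # 3. Poll until the scan is reported as finished, up to max_polls times.
        for _ in range(max_polls):
            status = request("GET", f"{BASE}/scanStatus/{scan_id}", None)
            if status["status"] == "Succeeded":
                break
            sleep(poll_seconds)
        else:
            raise TimeoutError(f"scan {scan_id} did not complete")

        # 4. Fetch the result and collect its workspaces.
        result = request("GET", f"{BASE}/scanResult/{scan_id}", None)
        workspaces.extend(result.get("workspaces", []))
    return workspaces
```

For a first run, call `scan_workspaces(request)` and record the time the scan started; on later runs, pass that timestamp as `modified_since` (an ISO 8601 UTC string) so only changed workspaces are scanned again.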
No special license is required for using the enhanced scanner APIs. It works for all of your tenant metadata, including non-Premium workspaces.