Announcing the Public Preview of Power BI Dataset Scale-Out

Author: Kay Unkroth

Today, we are excited to announce the public preview of Power BI Dataset Scale-Out, a dataset feature enabling enterprise customers to support large-scale Power BI solutions without any additional administrative overhead or infrastructure complexity. The idea is to let Power BI scale the number of dataset replicas and load-balance client connections dynamically to meet query processing demands at critical times up to the maximum available compute resources (vCores) of the underlying Premium capacity. During low-demand times, Power BI can automatically scale back to decrease the number of replicas again. In this way, enterprise customers can easily handle demand peaks in a cost-efficient and worry-free way and maximize their return on investment (ROI) in Power BI Premium compute resources.

Moreover, Power BI Dataset Scale-Out improves data refresh scenarios through refresh isolation, in which a separate read-write replica of the dataset is refreshed without disrupting users. Reports and client applications query read replicas of the same dataset, as the following diagram depicts, ensuring that Power BI users enjoy consistent data and fast, reliable query response times. Because the read-write replica does not have to serve user queries while it is being refreshed, customers can also leverage advanced data refresh optimization scenarios associated with the Enhanced Refresh REST API and the TMSL Refresh command, for example to minimize the memory that must be reserved for refreshing a dataset and therefore maximize the dataset size. A subsequent sync operation brings the read replicas up to the latest version of the read-write replica.
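As a rough illustration of such a refresh optimization, the following Python sketch describes two Enhanced Refresh REST API calls: a `clearValues` refresh that drops the loaded data, followed by a `full` refresh that reloads it. This is one plausible memory-saving sequence, not necessarily the one the product team has in mind. The function names, the `g1`/`d1` IDs, and the plan structure are hypothetical, and nothing is sent over the network. The sketch also assumes that the read replicas are synced only after the final step, so that users never see the cleared state; how sync is controlled is outside its scope.

```python
import json

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def refresh_request(group_id: str, dataset_id: str, refresh_type: str) -> dict:
    """Describe one Enhanced Refresh call (method, URL, JSON body) without sending it."""
    url = f"{API_ROOT}/groups/{group_id}/datasets/{dataset_id}/refreshes"
    body = {"type": refresh_type, "commitMode": "transactional"}
    return {"method": "POST", "url": url, "body": json.dumps(body)}


def low_memory_refresh_plan(group_id: str, dataset_id: str) -> list:
    """Two-step plan: clear the data first, then reload it.

    Clearing first means the read-write replica does not hold the old and
    the new data in memory at the same time (assumption of this sketch).
    """
    return [
        refresh_request(group_id, dataset_id, "clearValues"),
        refresh_request(group_id, dataset_id, "full"),
    ]
```

Each entry in the returned plan can be handed to an HTTP client of your choice together with an Azure AD access token. Wait for the first refresh to complete before submitting the second.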

Power BI Dataset Scale-Out offers the following key benefits:

  1. Cost-efficient dataset scalability: Enterprise customers can meet even the most critical demand during peak hours at no extra cost, up to the maximum available compute resources of their underlying Premium capacities. For example, Power BI can scale out a 20 GB dataset on a P1/A4 capacity to make the best possible use of the eight vCores available on that capacity. The individual dataset replicas do not count against the maximum memory limit per dataset, and customers are not charged for the additional memory consumption of the dataset replicas.
  2. Reliable performance for both refresh and query operations: Power BI maintains one read-write replica and additional read replicas and, by default, automatically synchronizes these replicas after update and refresh operations. All refresh operations are performed on the read-write replica without impacting the read replicas that Power BI reports and other client applications use. This isolation ensures that refresh operations do not impact query processing and vice versa, which maximizes performance for both refresh and query operations.
  3. Maximized dataset memory utilization: When selecting a Premium SKU, the amount of memory required to refresh a dataset must be accounted for in addition to the target dataset size in memory. This typically limits the dataset size to approximately 50% of the maximum available memory per dataset. By lowering the total memory requirements during processing through advanced refresh scenarios enabled via Premium Scale-Out, as mentioned earlier, less memory must be reserved for refresh operations, allowing larger datasets to be hosted on the selected Premium SKU. To find out how much memory is available for each dataset on a Premium capacity, refer to Capacities and SKUs in the product documentation.
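The memory arithmetic behind the third benefit can be sketched in a few lines of Python. The function name, the 25 GB per-dataset limit, and the 25% reserve are illustrative assumptions, not figures from this announcement. Only the roughly 50% rule of thumb comes from the text above, so check Capacities and SKUs for the actual limit of your SKU.

```python
def max_dataset_size_gb(memory_limit_gb: float,
                        refresh_reserve_fraction: float = 0.5) -> float:
    """Estimate the largest dataset that still leaves room for refresh.

    memory_limit_gb: maximum memory per dataset on the chosen SKU.
    refresh_reserve_fraction: share of that limit set aside for refresh
    operations (0.5 reflects the "approximately 50%" rule of thumb).
    """
    if memory_limit_gb <= 0:
        raise ValueError("memory_limit_gb must be positive")
    if not 0.0 <= refresh_reserve_fraction < 1.0:
        raise ValueError("refresh_reserve_fraction must be in [0, 1)")
    return memory_limit_gb * (1.0 - refresh_reserve_fraction)
```

With an illustrative 25 GB limit, `max_dataset_size_gb(25.0)` gives 12.5 GB under the 50% rule. If an optimized refresh needed only a hypothetical 25% reserve, `max_dataset_size_gb(25.0, 0.25)` would give 18.75 GB.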

During the public preview phase, workspaces must be enabled manually for Premium Scale-Out by using the following XMLA request (replace the WorkspaceName string with the actual name of your workspace). This means that you must host these workspaces on Power BI Premium or Power BI Embedded capacities with XMLA read/write operations enabled.

<Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
    <Command>
        <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
            <Alter ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
                <Object />
                <ObjectDefinition>
                    <Server xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                            xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"
                            xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"
                            xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100"
                            xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200"
                            xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200"
                            xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300"
                            xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300"
                            xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400"
                            xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400"
                            xmlns:ddl500="http://schemas.microsoft.com/analysisservices/2013/engine/500"
                            xmlns:ddl500_500="http://schemas.microsoft.com/analysisservices/2013/engine/500/500">
                        <Name>WorkspaceName</Name>
                        <ServerProperties>
                            <ServerProperty>
                                <Name>Feature\PBIP\QueryScaleOut</Name>
                                <Value>1</Value>
                            </ServerProperty>
                        </ServerProperties>
                    </Server>
                </ObjectDefinition>
            </Alter>
        </Batch>
    </Command>
    <Properties />
</Execute>
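If you need to enable several workspaces, filling in the placeholder by hand becomes error-prone, especially when a workspace name contains characters such as `&`. The following Python sketch substitutes and XML-escapes the name and checks that the result is still well-formed. The function name and the abbreviated template are assumptions made for illustration. In practice, paste the full request shown above as the template.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Abbreviated copy of the request above. The unused xmlns:* prefix
# declarations on <Server> are left out for brevity (an assumption of this
# sketch); use the full request when sending it to the service.
XMLA_TEMPLATE = r"""<Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
    <Command>
        <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
            <Alter ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
                <Object />
                <ObjectDefinition>
                    <Server>
                        <Name>WorkspaceName</Name>
                        <ServerProperties>
                            <ServerProperty>
                                <Name>Feature\PBIP\QueryScaleOut</Name>
                                <Value>1</Value>
                            </ServerProperty>
                        </ServerProperties>
                    </Server>
                </ObjectDefinition>
            </Alter>
        </Batch>
    </Command>
    <Properties />
</Execute>"""

PLACEHOLDER = "<Name>WorkspaceName</Name>"


def build_scale_out_request(workspace_name: str,
                            template: str = XMLA_TEMPLATE) -> str:
    """Return the XMLA request with the workspace name filled in."""
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain the placeholder exactly once")
    request = template.replace(
        PLACEHOLDER, f"<Name>{escape(workspace_name)}</Name>"
    )
    ET.fromstring(request)  # raises ParseError if the result is malformed
    return request
```

The returned string can then be executed in an XMLA query window of a tool such as SQL Server Management Studio while connected to the workspace.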

You should also double-check with your Power BI tenant admin that the Premium Scale-Out tenant setting is enabled. If this setting is disabled, the above workspace configuration is ignored and has no effect for as long as the tenant setting remains disabled. Moreover, note that Premium Scale-Out is only available for datasets that use the large dataset storage format. As a best practice, select the large dataset storage format as the default storage format for your scale-out-enabled workspaces.

For more information and detailed step-by-step instructions to test Premium Scale-Out and refresh isolation, see Premium Scale-Out in the product documentation. The initial public preview limits Premium Scale-Out to one read-write replica and one read replica. Our plan is to increase the replica limit during the preview phase and to enable Premium Scale-Out for all workspaces by default with the GA release of the feature. So don’t delay and see for yourself how Premium Scale-Out can help you get the most out of your investment in Power BI Premium. And as always, please provide us with feedback if you want to help deliver a rock-solid capability that enables enterprise customers to host large-scale solutions on the world’s best and most successful BI service: Power BI! We would love to hear from you!