We’re excited to announce that we’ve reached the final milestone in our dataset scale-out public preview journey! We started the preview without auto-sync and with a single read-only replica per dataset. A few months ago, we introduced auto-sync, and now Power BI can create as many read-only replicas as your Power BI capacity supports. Dataset scale-out is no longer limited to a single read-only replica per dataset.
Dataset scale-out is a feature that enables you to create read-only replicas of your datasets and distribute the query load among them automatically. A read-only replica is a dataset copy that can serve queries but cannot be updated by refresh operations. By having multiple read-only replicas, you can reduce the query latency and increase the throughput of your dataset. This is especially useful for scenarios where you have a high number of concurrent users or complex queries that consume a lot of resources.
To use dataset scale-out, you need to have a Fabric, Power BI Premium, or Power BI Embedded capacity that supports this feature. On capacities smaller than a P3, datasets can only have a single read-only replica because the compute power of these capacities typically doesn’t warrant higher replica counts. For P3 and above, datasets can scale to two or more read-only replicas depending on client load and processor utilization.
To enable scale-out for an individual dataset, you must set the maxReadOnlyReplicas parameter in the queryScaleOutSettings to a non-zero value, as demonstrated in our previous announcement of automatic replica synchronization for dataset scale-out. We recommend setting maxReadOnlyReplicas to -1 so that Power BI can manage the read-only replica count automatically for you. You can verify the current number of read-only replicas at any time by using the /queryScaleOut/syncStatus API, which returns the sync status of each read-only replica, as shown in the following screenshot.
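As a rough illustration of the two calls described above, here is a minimal Python sketch that builds the requests and counts read-only replicas in a sync status response. Only maxReadOnlyReplicas, queryScaleOutSettings, and /queryScaleOut/syncStatus come from this post. The base URL, the groups/datasets path, the PATCH verb, and the scaleOutReplicas and replicaType response fields are assumptions to check against the REST API documentation, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: base URL of the Power BI REST API; not stated in this post.
API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def dataset_url(workspace_id: str, dataset_id: str) -> str:
    # Assumption: datasets are addressed under a workspace ("group").
    return f"{API_ROOT}/groups/{workspace_id}/datasets/{dataset_id}"


def _headers(token: str) -> dict:
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


def build_scale_out_request(workspace_id, dataset_id, token, max_replicas=-1):
    """Build (but do not send) a request that sets maxReadOnlyReplicas.

    -1 asks Power BI to manage the replica count automatically.
    """
    body = {"queryScaleOutSettings": {"maxReadOnlyReplicas": max_replicas}}
    return urllib.request.Request(
        dataset_url(workspace_id, dataset_id),
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",  # assumption: settings are updated with PATCH
        headers=_headers(token),
    )


def build_sync_status_request(workspace_id, dataset_id, token):
    """Build (but do not send) a request to the syncStatus endpoint."""
    return urllib.request.Request(
        dataset_url(workspace_id, dataset_id) + "/queryScaleOut/syncStatus",
        method="GET",
        headers=_headers(token),
    )


def count_read_only_replicas(sync_status: dict) -> int:
    """Count replicas marked read-only in a parsed syncStatus response.

    Assumption: the response lists replicas under "scaleOutReplicas",
    each with a "replicaType" of "ReadOnly" or "ReadWrite".
    """
    replicas = sync_status.get("scaleOutReplicas", [])
    return sum(1 for r in replicas if r.get("replicaType") == "ReadOnly")
```

To send either request, pass it to `urllib.request.urlopen` with a valid access token and parse the response body with `json.load`. The sketch keeps request construction separate from sending so the payload and URL can be inspected before anything touches a live capacity.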
We’re super excited that dataset scale-out is now feature complete because it can provide many benefits to your Power BI experience, such as:
- Improved performance – By having multiple dataset replicas, you can reduce the query latency and increase the throughput of your datasets. This means faster and smoother reports and dashboards for your users.
- Reduced refresh impact – By keeping read-only replicas separate from the read-write replica, you can reduce the impact of refresh operations on your reports and dashboards. When you refresh your dataset, Power BI only updates the read-write replica while the read-only replicas continue to serve queries. The subsequent sync operation is fast and transparent, which means less disruption and more consistency for your users.
If you want to learn more, please visit our documentation page. We hope you enjoy the dataset scale-out feature, and we look forward to hearing your feedback.