In September, we announced the public preview of an enhanced Azure Databricks connector in Power BI Desktop. Today, we are excited to announce that the Azure Databricks connector has also been deployed to the Power BI service. You can now share your Power BI reports based on Azure Databricks with others by publishing them to the Power BI service.
Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. It can consume data at cloud scale from multiple data sources such as Azure Blob Storage, Azure Data Lake Storage, and Azure Cosmos DB. And thanks to the enhanced Azure Databricks connector, you can now deliver breakthrough insights at cloud scale in a self-service fashion in Power BI. You don’t need to deploy a data gateway. Power BI can connect directly, cloud-to-cloud.
The new connector greatly simplifies the experience of connecting to Databricks from Power BI. Users can sign in with their Azure Active Directory (AAD) credentials. And in the Power BI service, you can enable single sign-on (SSO), as the following screenshot illustrates, so that report users also use their own AAD credentials when accessing Databricks in DirectQuery mode. Moreover, you can promote your datasets in Power BI to make it even easier to share insights, encourage reuse, and preserve a single version of the truth.
The enhanced Azure Databricks connector delivers the following capabilities:
- Native connection configuration in Power BI Desktop: The new Databricks connector is natively integrated into Power BI. Power BI Desktop users can simply pick Azure Databricks as a data source, authenticate once using AAD, and start building their reports.
- DirectQuery support: DirectQuery mode enables Power BI users to consume data at cloud scale from Azure Databricks. Your users can now query and visualize large datasets without the size limitations imposed by data imports.
- Direct cloud-to-cloud connectivity in the Power BI service: As mentioned, you don’t need to deploy a data gateway to connect Power BI to Azure Databricks.
- Support for SSO in the Power BI service: Because the data remains in Azure Databricks in DirectQuery mode, you can enforce data lake-level security controls. SSO ensures that your users access Databricks with their own credentials. There is no need to duplicate security controls at the level of the Power BI dataset.
- New ODBC driver: The Azure Databricks connector is based on a new Databricks ODBC driver, which comes with significant performance improvements. Among other things, this driver reduces connection and query latencies, increases result transfer speed through Apache Arrow serialization, and improves metadata retrieval performance.
The enhanced Azure Databricks connector is the result of an ongoing collaboration between the Power BI and Azure Databricks product teams. Go ahead and take this enhanced connector for a test drive to improve your Databricks connectivity experience, and send us your feedback if you want to help deliver additional enhancements. We would love to hear from you!
Additional information:
- September 2020 feature summary introducing the Azure Databricks connector: https://powerbi.microsoft.com/blog/power-bi-september-2020-feature-summary/#Azure_Databricks
- DirectQuery mode in Power BI: https://docs.microsoft.com/power-bi/connect-data/desktop-directquery-about
- Reusing your datasets in the Power BI service from Power BI Desktop: https://docs.microsoft.com/power-bi/connect-data/desktop-report-lifecycle-datasets