In modern business intelligence projects, understanding the flow of data from the data source to its destination can be a challenge. The challenge is even bigger if you’ve built advanced analytical projects spanning multiple data sources, artifacts, and dependencies. Questions like “What happens if I change this data?” or “Why isn’t this report up to date?” can be hard to answer. They may require a team of experts or deep investigation to understand. To answer these and other questions we created the lineage view.
Earlier this year we released the dataflows diagram view, and now, with the lineage view release, we’re providing a full solution for your workspace. The new lineage view covers all Power BI workspace artifacts, including dataflows, datasets, reports, and dashboards, and their connections to external data sources. We’ve also included some new features, such as gateway information, highlighting the lineage path of a specific artifact, viewing lineage in full screen, and more. Throughout, we kept the user experience a top priority, and we hope you’ll enjoy it.
Sharon Matthews, Reporting and Analytics Manager at Veolia, who participated in the lineage view private preview program, recently shared with us that “Lineage view has helped us maintain multiple workspaces whilst ensuring that we can clearly see how one change in one dataset can impact multiple reports and dashboards. We can easily identify reports that haven’t refreshed and kick off refreshes from a friendly interface and also remove unused datasets easily. As a central team managing power bi reports this has proven to save us a lot of time”
Let’s see how it works in a real-life scenario
Troubleshooting can be challenging: data may move between workspaces, so tracing the source of the data and finding the root cause of an issue is not easy.
In this example, a dashboard is connected to multiple reports while one of the reports is built on top of an external dataset, ‘Contoso Customer 360’. Let’s say that you see, or are told by your manager, that the data displaying in a dashboard tile is out of date. To see the source of this dataset, you can click on the source workspace hyperlink and navigate to the other workspace’s lineage view to continue exploring its upstream artifacts.
You can now see that this dataset gets its data from multiple dataflows, which also have their own inter-dependencies. The dataflows get data from multiple external data sources, such as SQL, OData, and Common Data Service for Apps. You can see a refresh failure error on the ‘Product Sales’ and ‘Products’ dataflows, which suggests that this is why the tile is not updating. Clicking a dataflow card reveals more information to help with troubleshooting, including who the owner is. Finally, you can contact the owner of the relevant artifacts and let them know that an issue was found.
Manage your workspace using the lineage view
In addition to the connections between artifacts, lineage view provides metadata on the artifacts to help you manage the workspace effectively. We’ve also added the actions you need to manage your workspace right from the lineage view:
- An options menu on all artifact types for performing actions on the artifact, such as dataset refresh.
- Clicking an artifact displays a side pane with artifact metadata (e.g., last and next refresh time and status, owner, endorsement, and tables info).
- Data source information, including the connected gateway
Read more in the lineage view documentation.
How to get started
Every workspace, whether new or classic, automatically has a lineage view, except My Workspace.
To access lineage view, go to the workspace list view, tap the arrow next to List view, and select Lineage view.
Build your own lineage view using Power BI REST APIs
As part of this release, we’re also happy to announce that all the lineage information is available via Power BI REST APIs. The APIs are available to both Power BI service admins and other users. If you’re an admin, you have access to the metadata of all workspaces, and therefore you can create a cross-workspace lineage view. Here is the approach we recommend for getting this information:
- GetGroupsAsAdmin API with $expand to get all artifact information.
- To get lineage between dashboards and reports, use GetDashboardsAsAdmin with $expand=tiles.
- Use GetDatasources for Datasets and GetDatasources for Dataflows to connect datasets and dataflows and their data sources.
- In the new workspaces, a dataflow can be linked to another dataflow. To get this type of connection, use GetUpstreamDataflows for Dataflows.
- The API to connect datasets and their dataflows will be live in a few weeks.
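As a rough illustration of how the responses from these APIs could be stitched together, here is a minimal Python sketch. It is not an official sample. The helper names (`admin_groups_url`, `admin_dashboards_url`, `upstream_dataflows_url`, `lineage_edges`) are hypothetical, and the URL paths and JSON field names (`tiles`, `reportId`, `datasetId`, `targetDataflowId`) are assumptions based on the public REST API reference that should be checked against the current documentation. The sketch only builds request URLs and joins payloads that have already been fetched; authentication and the HTTP calls themselves are left out.

```python
API_ROOT = "https://api.powerbi.com/v1.0/myorg"  # assumed service root


def admin_groups_url(top=100):
    """URL for GetGroupsAsAdmin, expanding the artifacts of each workspace."""
    expand = "dashboards,reports,datasets,dataflows"
    return f"{API_ROOT}/admin/groups?$expand={expand}&$top={top}"


def admin_dashboards_url():
    """URL for GetDashboardsAsAdmin with the tiles of each dashboard."""
    return f"{API_ROOT}/admin/dashboards?$expand=tiles"


def upstream_dataflows_url(group_id, dataflow_id):
    """URL for GetUpstreamDataflows for a dataflow in a workspace."""
    return f"{API_ROOT}/groups/{group_id}/dataflows/{dataflow_id}/upstreamDataflows"


def lineage_edges(dashboards, reports, upstream):
    """Join already-fetched payloads into (upstream, downstream) edges.

    dashboards: dashboard objects, each with an optional "tiles" list
    reports:    report objects, each with an optional "datasetId"
    upstream:   maps a dataflow id to its list of upstream-dataflow entries
    """
    edges = set()
    for dashboard in dashboards:
        for tile in dashboard.get("tiles", []):
            # A tile pinned from a report links that report to the dashboard.
            if tile.get("reportId"):
                edges.add((("report", tile["reportId"]),
                           ("dashboard", dashboard["id"])))
    for report in reports:
        # A report is downstream of the dataset it is built on.
        if report.get("datasetId"):
            edges.add((("dataset", report["datasetId"]),
                       ("report", report["id"])))
    for dataflow_id, parents in upstream.items():
        for parent in parents:
            edges.add((("dataflow", parent["targetDataflowId"]),
                       ("dataflow", dataflow_id)))
    return edges
```

Each edge is a pair of `(artifact type, id)` tuples pointing from the upstream artifact to the downstream one, so the resulting set can be loaded into any graph library or walked directly to answer questions such as which dashboards depend on a given dataset.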
Learn more about Power BI REST APIs.
The Power BI team plans to keep investing in addressing more challenges in the data lineage world. This includes:
- Lineage of paginated reports
- More metadata on reports and dashboards in the lineage view
- Cross-workspace dataset impact analysis
We encourage you to promote your ideas in the Power BI user voice.