In modern BI projects, understanding the flow of data from the data source to its destination, and understanding the data transformation done along the way, is a key challenge for many customers.
This challenge becomes even bigger if you have built advanced analytical projects spanning multiple data sources, artifacts, and dependencies. Questions like “What happens if I change this data?” or “Why isn’t this report up to date?” are difficult to answer, and sometimes require a team of experts or a deep investigation to resolve.
Today we’re excited to introduce a first step in Power BI to address these challenges, by providing a way to view the dependencies and connections between your dataflows.
Let’s see how it works in a real-life scenario.
Our data lineage diagram view provides a visual representation for understanding a multi-stage ETL process (created with Linked entities) defined with dataflows.
In the example shown above, data is ingested from 3 data sources – Common Data Service for Apps (Dynamics), service call data from a blob storage, and website telemetry from the web.
By looking at the graph, you can easily see how the data moves from the data sources to the dataflows, and then how it feeds the linked dataflows downstream, until it reaches the Customer 360 view.
You can further explore the data and drill down to view the details.
By clicking on the dataflow node, you can see what entities this dataflow contains.
In the above example, you can see that the ‘Ingest Dynamics data’ dataflow, which gets data from Common Data Service for Apps, contains 4 entities: Account, Lead, Opportunity, and Product.
By clicking the link between dataflows, you can see which entities move from one dataflow to another.
In the example above, you can see that data from ‘Services raw data’ moves to ‘Support Calls Agg’ for further computation.
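To make the idea behind the diagram concrete, here is a minimal sketch of the same lineage modeled as a directed graph in Python. It is only an illustration of the concept, not how Power BI implements the view. The dataflow and source names come from the example where the post states them; the links and names marked "assumed" in the comments are hypothetical, added so the graph is complete.

```python
from collections import deque

# Edges point downstream: producer -> consumers.
# Entries marked "assumed" are illustrative and not stated in the post.
LINEAGE = {
    "Common Data Service for Apps": ["Ingest Dynamics data"],
    "Blob storage": ["Services raw data"],        # assumed link
    "Web": ["Website telemetry"],                 # assumed dataflow name
    "Ingest Dynamics data": ["Customer 360"],     # assumed link
    "Services raw data": ["Support Calls Agg"],
    "Support Calls Agg": ["Customer 360"],        # assumed link
    "Website telemetry": ["Customer 360"],        # assumed link
    "Customer 360": [],
}


def downstream(graph, start):
    """Return every node reachable from start, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(graph, target):
    """Return every node that target depends on, directly or indirectly."""
    # Reverse the edges, then reuse the same traversal.
    reverse = {}
    for parent, children in graph.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(reverse, target)


# "What happens if I change this data?"
print(downstream(LINEAGE, "Services raw data"))
# "Why isn't this report up to date?"
print(upstream(LINEAGE, "Support Calls Agg"))
```

Walking the edges forward answers the impact question (what is affected by a change), while walking them backward answers the freshness question (what a dataflow depends on). The diagram view lets you read both directions visually.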
We’ve also introduced the following new features:
- Show data sources display control: You can switch on and off the display of data sources (e.g. Azure SQL) on this view.
- Several layout options: In addition to the default layout (left to right), you can also change the layout to any of the following: top to bottom, bottom to top, right to left.
- Zoom functionality: We’ve added a zoom feature, available from the header control or the mouse scroll wheel. You can now zoom in and out to view fewer or more graph nodes and focus on the information that matters most to you.
- Entities list and entity editing: Clicking the number of entities on a node opens the entities list side panel. You can then edit an entity by clicking the edit icon (pencil) that appears when you hover over it.
- Drag and drop: Use drag and drop to pan right, left, up, and down inside the graph canvas.
How to get started
It only takes one click.
To see the data lineage view, open an app workspace, go to the dataflows tab, and change the view mode from “List view” to the new “Diagram view”. Once you change the view, it becomes the default for that workspace (the choice is cached in the browser that you use). The next time you open that workspace, you’ll automatically land on the view you last selected.
You can think about this view as an alternative to the traditional list view, and as such, all the actions on the data or metadata that you have in the list view are available in this view as well.
This release is a first step in the Power BI team’s bigger plans to extend lineage capabilities, in the dataflows layer as well as across datasets, reports, and dashboards.
We encourage you to share and vote on ideas in the Power BI UserVoice.