⚠ Note
This post is archived. Content may be outdated, and links may no longer function. Please reach out to our community if you have questions about specific posts.
The dataset API was the first REST API that we ever released for Power BI. It allowed a developer to programmatically create a simple dataset and then push data directly to a table within that dataset. This allowed developers to send data from any type of source to Power BI. Today, we are pleased to announce the preview of two new features that will enhance the way that data can be pushed to Power BI:
- Additional modeling properties when generating datasets.
- Specifying sequence numbers to guarantee which rows have been successfully pushed and avoiding duplication
Additional modeling properties
The current v1 of datasets API only allows for a dataset to be created with a name and a collection of tables. Each table can have a name and a collection of columns. Each column has a name and datatype. With this preview release, we are greatly expanding these properties most notably with support for measures and relationships between tables. The complete list of supported properties is as followed:
Dataset
Name | Type | Description | Read Only | Required |
id | Guid | System wide unique identifier for the dataset | True | false |
name | String | User defined name of the dataset | False | True |
tables | Table[] | Collection of tables | False | False |
relationships | Relationship[] | Collection of relationships between tables | False | False |
Relationship
Name | Type | Description | Read Only | Required |
name | String | User defined name of the relationship. It is also used as the identifier of the relationship. | False | True |
crossFilteringBehavior | String | The filter direction of the relationship
| False | False |
fromTable | String | Name of the foreign key table | False | True |
fromColumn | String | Name of the foreign key column | False | True |
toTable | String | Name of the primary key table | False | True |
toColumn | String | Name of the primary key column | False | True |
Table
Name | Type | Description | Read Only | Required |
name | String | User defined name of the table. It is also used as the identifier of the table. | False | True |
columns | column[] | Collection of columns. | False | True |
measures | measure[] | Collection of measures | False | False |
isHidden | Boolean | If true, table will be hidden from client tools | False | false |
Column
Name | Type | Description | Read Only | Required | ||||||||||||
name | String | User defined name of the column. | False | True | ||||||||||||
dataType | EDM Datatype | Supported EDM date types and restrictions:
| False | True | ||||||||||||
formatString | String | A string describing how the value should be formatted when it is displayed. https://msdn.microsoft.com/en-us/library/ms146084.aspx | False | False | ||||||||||||
sortByColumn | String | String name of a column in the same table to be used to order the current column | False | False | ||||||||||||
dataCategory | String | String value to be used for the data category which describes the data within this column. Some common values include:
| False | False | ||||||||||||
isHidden | Boolean | Property indicating if the column is hidden from view. Default is false. | False | False | ||||||||||||
summarizeBy | string | Default aggregation method for the column. Values include:
| False | False |
Measure
Name | Type | Description | Read Only | Required |
name | String | User defined name of the measure. | False | True |
expression | String | A valid DAX expression | False | True |
formatString | String | A string describing how the value should be formatted when it is displayed. https://msdn.microsoft.com/en-us/library/ms146084.aspx | False | False |
isHidden | String | If true, table will be hidden from client tools | False | false |
A sample dataset that uses just some of these properties would look like this:
{ “name”: “PushAdvanced”, “tables”: [ { “name”: “Date”, “columns”: [ { “name”: “Date”, “dataType”: “dateTime”, “formatString”: “dddd\\, mmmm d\\, yyyy”, “summarizeBy”: “none” } ] }, { “name”: “sales”, “columns”: [ { “name”: “Date”, “dataType”: “dateTime”, “formatString”: “dddd\\, mmmm d\\, yyyy”, “summarizeBy”: “none” }, { “name”: “Sales”, “dataType”: “int64”, “formatString”: “0”, “summarizeBy”: “sum” } ], “measures”: [ { “name”: “percent to forecast”, “expression”: “SUM(sales[Sales])/SUM(forecast[forecast])”, “formatString”: “0.00 %;-0.00 %;0.00 %” } ] }, { “name”: “forecast”, “columns”: [ { “name”: “date”, “dataType”: “dateTime”, “formatString”: “m/d/yyyy”, “summarizeBy”: “none” }, { “name”: “forecast”, “dataType”: “int64”, “formatString”: “0”, “summarizeBy”: “sum” } ] } ], “relationships”: [ { “name”: “2ea345ce-b147-436e-8ac2-9d3c4d82af8d”, “fromTable”: “sales”, “fromColumn”: “Date”, “toTable”: “Date”, “toColumn”: “Date”, “crossFilteringBehavior”: “bothDirections” }, { “name”: “5d95f419-e589-4345-9581-6e70670b1bba”, “fromTable”: “forecast”, “fromColumn”: “date”, “toTable”: “Date”, “toColumn”: “Date”, “crossFilteringBehavior”: “bothDirections” } ] } |
You can try creating a new dataset without writing any code at: https://docs.powerbi.apiary.io/#reference/datasets-preview/datasets-collection/create-a-dataset.
The new dataset is current only available in preview. It can be accessed at https://api.powerbi.com/beta/myorg/datasets. Please leave any feedback at https://community.powerBI.com.
Sequence numbers to guarantee which rows have been successfully pushed
When pushing rows to a table in Power BI, you will receive a 200 OK response when the rows are successfully loaded to Power BI, and some error status code when they are not. However, what about those rare cases when you don’t receive a response at all due to some network or timeout issue? Should you push the data again? Before now, there was no way to know. Now, with sequence numbers, you always know what state your data is in. Sequence numbers are assigned to a set of rows that you are pushing. If you are unsure if the previous call succeeded, send with the same Sequence Number and we’ll determine if the call is a duplicate or not.
You can take advantage of sequence numbers by adding the “X-PowerBI-PushData-SequenceNumber” header to your “POST rows” call:
This header takes a 64-bit integer, which you should increase in value each time you post new rows. Power BI will save the most recent sequence number with your data. As long as each of your “POST rows” requests has a higher sequence number than what is stored with the data, the request will succeed. If you are unsure if your request succeeded or not, you should resend it with the same sequence number as the original request. If the original request had succeeded, the second request would fail with a 412 “Precondition failed” error:
Then you can move on to the next set of data with a new sequence number. If you need to discover the current sequence number which is stored with your table data, you can make the following API call on the table:
The response body will look something like this:
Try the API now at: https://docs.powerbi.apiary.io/#reference/datasets/table-sequence-number.
The above approach works great if you have a single device or client pushing to a table in Power BI. However, if you have multiple devices or clients pushing to the same table at the same time you can add the “X-PowerBI-PushData-ClientId” header and a client ID to maintain a separate sequence number for each client. Use of this header is optional.
Sequence numbers are support in both the v1 and beta versions of the API.
To try all the APIs mentioned in this article, please visit: https://docs.powerbi.apiary.io. For more information on developing for Power BI, see: https://dev.powerbi.com.