The dataset API was the first REST API that we ever released for Power BI. It allowed a developer to programmatically create a simple dataset and then push data directly to a table within that dataset. This allowed developers to send data from any type of source to Power BI. Today, we are pleased to announce the preview of two new features that will enhance the way that data can be pushed to Power BI:

  1. Additional modeling properties when generating datasets.

  2. Specifying sequence numbers to guarantee which rows have been successfully pushed and avoiding duplication

 

Additional modeling properties

The current v1 of datasets API only allows for a dataset to be created with a name and a collection of tables. Each table can have a name and a collection of columns. Each column has a name and datatype. With this preview release, we are greatly expanding these properties most notably with support for measures and relationships between tables. The complete list of supported properties is as followed:

Dataset

Name

Type

Description

Read Only

Required

id

Guid

System wide unique identifier for the dataset

True

false

name

String

User defined name of the dataset

False

True

tables

Table[]

Collection of tables

False

False

relationships

Relationship[]

Collection of relationships between tables

False

False

 

Relationship

 

Name

Type

Description

Read Only

Required

name

String

User defined name of the relationship. It is also used as the identifier of the relationship.

False

True

crossFilteringBehavior

String

The filter direction of the relationship

  • OneDirection (default)
  • BothDirections
  • Automatic

False

False

fromTable

String

Name of the foreign key table

False

True

fromColumn

String

Name of the foreign key column

False

True

toTable

String

Name of the primary key table

False

True

toColumn

String

Name of the primary key column

False

True

 

Table

Name

Type

Description

Read Only

Required

name

String

User defined name of the table. It is also used as the identifier of the table.

False

True

columns

column[]

Collection of columns.

False

True

measures

measure[]

Collection of measures

False

False

isHidden

Boolean

 If true, table will be hidden from client tools

False

false

 

Column

Name

Type

Description

Read Only

Required

name

String

User defined name of the column.

False

True

dataType

EDM Datatype

Supported EDM date types and restrictions:

Data type

Restrictions

Int64

Int64.MaxValue and Int64.MinValue not allowed.

Double

Double.MaxValue and Double.MinValue values not allowed. NaN not supported.+Infinity and -Infinity not supported in some functions (e.g. Min, Max).

Boolean

 

Datetime

During data loading we quantize values with day fractions to whole multiples of 1/300 seconds (3.33ms).

String

Currently allows up to 128K characters.

False

True

formatString

String

A string describing how the value should be formatted when it is displayed. https://msdn.microsoft.com/en-us/library/ms146084.aspx

False

False

sortByColumn

String

String name of a column in the same table to be used to order the current column

False

False

dataCategory

String

String value to be used for the data category which describes the data within this column. Some common values include:

  • Address
  • City
  • Continent
  • Country
  • Image
  • ImageUrl
  • Latitude
  • Longitude
  • Organization
  • Place
  • PostalCode
  • StateOrProvince
  • WebUrl

False

False

isHidden

Boolean

Property indicating if the column is hidden from view. Default is false.

False

False

summarizeBy

string

 Default aggregation method for the column. Values include:

  • default
  • none
  • sum
  • min
  • max
  • count
  • average
  • distinctCount

 False

False

 

Measure

Name

Type

Description

Read Only

Required

name

String

User defined name of the measure.

False

True

expression

String

A valid DAX expression

False

True

formatString

String

A string describing how the value should be formatted when it is displayed. https://msdn.microsoft.com/en-us/library/ms146084.aspx

False

False

isHidden

String

 If true, table will be hidden from client tools

False

false

 

A sample dataset that uses just some of these properties would look like this:

{

  "name": "PushAdvanced",

  "tables": [

    {

      "name": "Date",

      "columns": [

        {

          "name": "Date",

          "dataType": "dateTime",

          "formatString": "dddd\\, mmmm d\\, yyyy",

          "summarizeBy": "none"

        }

      ]

    },

    {

      "name": "sales",

      "columns": [

        {

          "name": "Date",

          "dataType": "dateTime",

          "formatString": "dddd\\, mmmm d\\, yyyy",

          "summarizeBy": "none"

        },

        {

          "name": "Sales",

          "dataType": "int64",

          "formatString": "0",

          "summarizeBy": "sum"

        }

      ],

      "measures": [

        {

          "name": "percent to forecast",

          "expression": "SUM(sales[Sales])/SUM(forecast[forecast])",

          "formatString": "0.00 %;-0.00 %;0.00 %"

        }

      ]

    },

    {

      "name": "forecast",

      "columns": [

        {

          "name": "date",

          "dataType": "dateTime",

          "formatString": "m/d/yyyy",

          "summarizeBy": "none"

        },

        {

          "name": "forecast",

          "dataType": "int64",

          "formatString": "0",

          "summarizeBy": "sum"

        }

      ]

    }

  ],

  "relationships": [

    {

      "name": "2ea345ce-b147-436e-8ac2-9d3c4d82af8d",

      "fromTable": "sales",

      "fromColumn": "Date",

      "toTable": "Date",

      "toColumn": "Date",

      "crossFilteringBehavior": "bothDirections"

    },

    {

      "name": "5d95f419-e589-4345-9581-6e70670b1bba",

      "fromTable": "forecast",

      "fromColumn": "date",

      "toTable": "Date",

      "toColumn": "Date",

      "crossFilteringBehavior": "bothDirections"

    }

  ]

}

 

You can try creating a new dataset without writing any code at: http://docs.powerbi.apiary.io/#reference/datasets-preview/datasets-collection/create-a-dataset.

The new dataset is current only available in preview. It can be accessed at https://api.powerbi.com/beta/myorg/datasets. Please leave any feedback at http://community.powerBI.com.

Sequence numbers to guarantee which rows have been successfully pushed

When pushing rows to a table in Power BI, you will receive a 200 OK response when the rows are successfully loaded to Power BI, and some error status code when they are not. However, what about those rare cases when you don’t receive a response at all due to some network or timeout issue? Should you push the data again? Before now, there was no way to know. Now, with sequence numbers, you always know what state your data is in. Sequence numbers are assigned to a set of rows that you are pushing.  If you are unsure if the previous call succeeded, send with the same Sequence Number and we’ll determine if the call is a duplicate or not.

You can take advantage of sequence numbers by adding the “X-PowerBI-PushData-SequenceNumber” header to your “POST rows” call:

This header takes a 64-bit integer, which you should increase in value each time you post new rows. Power BI will save the most recent sequence number with your data. As long as each of your “POST rows” requests has a higher sequence number than what is stored with the data, the request will succeed. If you are unsure if your request succeeded or not, you should resend it with the same sequence number as the original request. If the original request had succeeded, the second request would fail with a 412 “Precondition failed” error:

Then you can move on to the next set of data with a new sequence number. If you need to discover the current sequence number which is stored with your table data, you can make the following API call on the table:

The response body will look something like this:

Try the API now at: http://docs.powerbi.apiary.io/#reference/datasets/table-sequence-number.

The above approach works great if you have a single device or client pushing to a table in Power BI. However, if you have multiple devices or clients pushing to the same table at the same time you can add the “X-PowerBI-PushData-ClientId” header and a client ID to maintain a separate sequence number for each client. Use of this header is optional.

Sequence numbers are support in both the v1 and beta versions of the API.

To try all the APIs mentioned in this article, please visit: http://docs.powerbi.apiary.io. For more information on developing for Power BI, see: http://dev.powerbi.com.