Demystifying Power BI Q&A – Part 1
Microsoft Thursday, February 27, 2014
By now, you’ve probably seen a demo of Power BI Q&A and may have played with one of our sample models. So at this point, you’re probably thinking one of two things: Either you’re amazed by how smart the system is and are convinced it can effortlessly answer nearly any question about any data; or you don’t believe such a thing is possible and are certain it only works in carefully designed toy models after an army of PhD linguists have hand-tuned the system for years.
The truth is somewhere in between.
While we do have a team of natural language wizards at our disposal, their magic is reserved for building the core capabilities of the system, rather than optimizing for any specific model. What they’ve built is essentially a natural language understanding engine based on traditional search and pattern matching techniques, enhanced with:
- Heuristic matching based on the structure of the model
- General understanding of English syntax
- Pre-built business intelligence commands, and
- Customizable domain-specific knowledge of the specific kinds of language used in the target model.
These core capabilities can be applied to any data in a Power Pivot model. However, regardless of how much intelligence we build into it, there are clear limits to the abilities of any system to interpret both the data and the questions targeted at it. It can’t calculate “total sales by year” from a pile of customer invoice emails. And it won’t be able to tell you “why sales didn’t go up last year”, even if you kept meticulous records.
But rather than just talking about what it can’t do, let’s tell you what it can do.
In this series of articles, we’ll be walking you through details of what you can expect Q&A to understand and some things you can do to both improve the answers and make a wider variety of questions work. Today, we’ll describe what Q&A can do out of the box on raw data with little or no effort. In Part 2 next week, we’ll teach you about basic model optimization beneficial both for building and consuming Power View reports as well as for Q&A. Finally, in Part 3, we’ll give you an early preview of the creation of phrasing rules, which are used to reduce ambiguity and introduce complex terms.
Imagine you need to publish some data for your users to query and build reports against, but you’re up against a tight deadline or running late for a meeting or just plain feeling lazy. So instead of your normal careful scrubbing of the data, cleaning of the schema, and so on, you just quickly import your data into Power Pivot and upload the workbook into your Power BI site. Then almost as an afterthought, you enable it for Q&A. What happens?
Fortunately, Q&A has a set of core natural language understanding abilities which work across every Power Pivot model. First, it has context-dependent keyword search capabilities for both data and metadata. Second, it has a set of built-in knowledge for how to filter, sort, aggregate, group and display data. While Q&A will definitely work better against models that have been optimized for Power View and best against models annotated with phrasing rules, these two primary capabilities apply to all models. Note: In the examples below, we did one such optimization, namely setting the Default Fields on some tables to avoid showing too many columns by default.
Basic keyword search
Columns and Tables
With a raw data import, everything depends on the names of the tables and columns, so hopefully your source data had sensible naming conventions. Q&A detects obvious word breaks and plurals, to allow search terms such as “contact name” to match the column ContactName or “retail customer” to match the table retail_customers. It does not yet, however, try to guess the meaning behind abbreviated names like AvgNetPrft, or ones with nonobvious word breaks like pickuptime, or automatically search related terms such as matching “client” to a table named Customer.
Q&A will match string data values it finds in the model. These searches are case insensitive, and will match either the exact word typed or the singular form of that word. For example, “London filo mix orders”, “London Filo Mix orders” and “london orders for filo mixes” will all find orders in London for the product named Filo Mix
For date and year values, Q&A will make a best guess for which column to match against, based on the column data types.
When multiple tables or columns are referenced in a query, Q&A utilizes the implicit relationships between a table and the columns it contains (in queries like “customers and their addresses”) and the explicit relationships defined in the model between tables (in queries like “list customers and their orders”).
Distantly related things can be used in a single query, so long as there exists a set of relationships between them.
Contextual ambiguity resolution
Q&A’s keyword search capabilities will attempt to use context to resolve ambiguity caused by duplicate column names or duplicate values. For example, “customers 2012 London phone number” will match “London” to customer city rather than employee city and “phone number” to customer phone number rather than employee phone number.
Anything can be sorted by any related column:
And sorted in a chosen direction:
Column values can be explicitly filtered:
Or compared to ranges:
Date range filters
Dates can be filtered exactly…
Or as ranges…
Or relative to today.
Aggregation and grouping
Aggregates can be requested for numeric columns.
These aggregates can be filtered…
Explicit visualization type requests
Selection of visualization type can be left to Q&A based on the columns selected and the categories of data specified or inferred in the model (in this case, geographical)…
Or the type can be named explicitly.
In case you’re interested in playing with this sample model yourself, it can be downloaded below.
That’s quite a bit of “free” functionality for little to no work on your part, and for certain raw data imports, Q&A does a surprisingly good job of answering user questions. However, there are a number of ways this can go wrong. Next week in Part 2, we’ll step you through the most common problems, and show how you can fix them and improve your Power View experience at the same time.