際際滷

際際滷Share a Scribd company logo
Power BI: Asking questions the natural way using Q & A
Power BI: Asking questions the natural way 
using Q + A 
Bjoern H Rapp
About me 
Business Developer at Steria Twitter: @bjoern_rapp 
SharePoint MVP and MCT Blog: www.sharepointviking.com 
10 + years in system dev Email: safetox@hotmail.com
AGENDA 
 Power BI Primer 
 Q & A 
 Building Models 
 Synonyms 
 Behind the scenes
My Goals for the sessionnot like this.
but this!! .
POWER BI familiarity of Office and the power of the cloud 
EXCEL 2013 POWER BI for Office 365 
Discover Analyze Visualize Share Find Mobile
Building Models - Modeling Tips 
 Fixing Missing Relationships 
 Renaming Tables and Columns 
 Choose a default Field Set for each table 
 Choose a default Label for each table 
 Fix incorrect data types 
 Mark year and id columns as Do not Summarize 
 Choose a data category for each data and geo col 
 Choose a sort by column for relevant column
Building Models  Missing Relationships 
POOR 
BETTER
Building Models  Renaming tables and columns 
POOR 
BETTER
Building ModelsDefault field set for each table
Building ModelsDefault label for each table
Building Models  Fix incorrect data types
Building Models  Do not summarize id and date columns
Building Models  Choose a Data Category for each date and 
geography column
Building Models  Choose a Sort By Column for relevant 
columns
Building Models  Normalize your model 
POOR 
BETTER
Synonyms 
 Unique for Q + A
1. 
Sharing Questions 
 Sharing links 
 Featured Questions
1. 
Behind the scenes  Choosing the best answer 
Search for interpretations of 
questions 
Score and select best interpretation 
Select best visual
1. 
medal count by country for 2012 
Search for interpretations of 
questions 
Score and select best interpretation 
Select best visual 
from Olympics.xlxs 
show medal[medal count], medal[country] 
Where Olympics[year] = 2012 
from Olympics.xlxs 
show medal[medal count], Olympics[host country] 
where olympics[year] = 2012 
from Sales.xlxs 
show customer[country] 
where order[order year] = 2012
1. 
medal count by country for 2012 
Search for interpretations of 
questions 
Score and select best interpretation 
Select best visual 
from Olympics.xlxs 
show medal[medal count], medal[country] 
Where Olympics[year] = 2012 
from Olympics.xlxs 
show medal[medal count], Olympics[host country] 
where olympics[year] = 2012 
from Sales.xlxs 
show customer[country] 
where order[order year] = 2012 
0.86 
0.72 
0.46
1. 
medal count by country for 2012 
Search for interpretations of 
questions 
Score and select best interpretation 
Select best visual 
from Olympics.xlxs 
show medal[medal count], medal[country] 
Where Olympics[year] = 2012 
from Olympics.xlxs 
show medal[medal count], Olympics[host country] 
where olympics[year] = 2012 
from Sales.xlxs 
show customer[country] 
where order[order year] = 2012 
0.86 
0.72 
0.46
1. 
Influencing the best answer 
Search for interpretations of 
questions 
Score and select best interpretation 
Select best visual 
Data modeling, 
synonyms 
Workbooks in site 
Data modeling (data types, data 
categories)
Power BI: Asking questions the natural way using Q & A
1. 
THANK YOU!

More Related Content

Power BI: Asking questions the natural way using Q & A

  • 2. Power BI: Asking questions the natural way using Q + A Bjoern H Rapp
  • 3. About me Business Developer at Steria Twitter: @bjoern_rapp SharePoint MVP and MCT Blog: www.sharepointviking.com 10 + years in system dev Email: safetox@hotmail.com
  • 4. AGENDA Power BI Primer Q & A Building Models Synonyms Behind the scenes
  • 5. My Goals for the sessionnot like this.
  • 7. POWER BI familiarity of Office and the power of the cloud EXCEL 2013 POWER BI for Office 365 Discover Analyze Visualize Share Find Mobile
  • 8. Building Models - Modeling Tips Fixing Missing Relationships Renaming Tables and Columns Choose a default Field Set for each table Choose a default Label for each table Fix incorrect data types Mark year and id columns as Do not Summarize Choose a data category for each data and geo col Choose a sort by column for relevant column
  • 9. Building Models Missing Relationships POOR BETTER
  • 10. Building Models Renaming tables and columns POOR BETTER
  • 11. Building ModelsDefault field set for each table
  • 13. Building Models Fix incorrect data types
  • 14. Building Models Do not summarize id and date columns
  • 15. Building Models Choose a Data Category for each date and geography column
  • 16. Building Models Choose a Sort By Column for relevant columns
  • 17. Building Models Normalize your model POOR BETTER
  • 18. Synonyms Unique for Q + A
  • 19. 1. Sharing Questions Sharing links Featured Questions
  • 20. 1. Behind the scenes Choosing the best answer Search for interpretations of questions Score and select best interpretation Select best visual
  • 21. 1. medal count by country for 2012 Search for interpretations of questions Score and select best interpretation Select best visual from Olympics.xlxs show medal[medal count], medal[country] Where Olympics[year] = 2012 from Olympics.xlxs show medal[medal count], Olympics[host country] where olympics[year] = 2012 from Sales.xlxs show customer[country] where order[order year] = 2012
  • 22. 1. medal count by country for 2012 Search for interpretations of questions Score and select best interpretation Select best visual from Olympics.xlxs show medal[medal count], medal[country] Where Olympics[year] = 2012 from Olympics.xlxs show medal[medal count], Olympics[host country] where olympics[year] = 2012 from Sales.xlxs show customer[country] where order[order year] = 2012 0.86 0.72 0.46
  • 23. 1. medal count by country for 2012 Search for interpretations of questions Score and select best interpretation Select best visual from Olympics.xlxs show medal[medal count], medal[country] Where Olympics[year] = 2012 from Olympics.xlxs show medal[medal count], Olympics[host country] where olympics[year] = 2012 from Sales.xlxs show customer[country] where order[order year] = 2012 0.86 0.72 0.46
  • 24. 1. Influencing the best answer Search for interpretations of questions Score and select best interpretation Select best visual Data modeling, synonyms Workbooks in site Data modeling (data types, data categories)

Editor's Notes

  • #3: Q & A is probably one of the most interesting and fascinating features that comes with Power BI. Q and A is about making the data as easily accessible as possible to as many users as possible. Q & A is all about data exploration and its based on the underlying Data Model just like Power View and Power Map, two of the authoring tools. So the better the data model is, the more sophisticated questions you can ask and the more relevant answers you will get. 1. How do I find the right data that answer my questions at this specific moment of time There are lots of different data sources, so how do you know which ones to look at. 2. Finding the right answer among a myriad of different prebuilt reports. Sometimes they maybe didnt slice the right way, or maybe you wanted the answer right now, but building the report that gave you the answer would take a while. So Q & A was built to face these problems, and its all data exploration enabling anyone in the organization to discover and find answers to problems , and more importantly finding the answer by asking the questions using a Natural Language, so you dont have to learn a new expression language or new keywords. You just type as you would to any search engine.
  • #8: - Power BI , self service alternative for business users - Quick reports, supplement to the standard reports Based on well known tools like Excel for the authoring Office 365 for the tooling Briefly mention the different tools, point out where Q & A fits in.
  • #10: If you are missing relationships between tables in your model, neither Power View nor Q&A are going to start wildly guessing how to join between those tables if you ask a questions about them. Power Pivot relationships are the cornerstone of a good model. For example, you cannot ask for the total sales for Seattle customers if the relationship between the orders table and the customers table is missing.
  • #11: While poorly named tables and columns make even manual Power View report authoring difficult, choice of names is even more critical for Q&A. For example, if you have a table named CustomerSummary, which contains a list of your customers, you would need to ask questions like List the customer summaries in Chicago rather than List the customers in Chicago. Not exactly intuitive. While Q&A can do some basic word breaking and detection of plurals, it is going to trust that your table and column names accurately reflect the content. Consider also this case: You have a table named Headcount that contains first and last names and employee numbers, and you have another table named Employees contains employee numbers, job numbers and start dates. While this might be decipherable to someone familiar with the model, someone who asks count the employees is going to get a count of the rows from the Employees table, which is probably not what they had in mind, since thats a count of every job each employee has ever had. Better to rename the tables to truly reflect what they actually contain.
  • #12: The Default Field Set is used for selecting the columns to display by default when asking questions about a table. When you ask for orders for last week, you probably dont want to see all thirty columns in your Orders table.
  • #13: The Default Label is used in Power View for selecting items in filters. It is also used by Q&A when deciding what kind of graph or chart to use for displaying your results. For example, count orders by customer will show a tabular result if the Default Label is not set for the customers table, since there are too many columns to display by default. Setting it to the Customer Name column will allow the result to be shown in a bar chart. Note that some tables, such as transaction tables, will only have a Default Field Set, and no Default Label.
  • #14: Imported data can have incorrect data types. In particular, date and number columns that are imported as strings will not be interpreted by Q&A as dates and numbers. You should select the correct data type in Power Pivot.
  • #15: Power View aggressively aggregates numeric columns by default, so questions like total sales by year can sometimes result in a grand total of sales alongside a grand total of years. If you have specific columns where you do not want this behavior, set the Summarize By property on the column to Do Not Summarize. Be on the lookout for year, month, day, and ID columns, as these are the most frequent problems. Other columns that arent sensible to sum, such as age, could also benefit from setting Summarize By to Do Not Summarize or to Average.
  • #16: The Data Category provides additional semantic knowledge about the content of a column beyond its mere data type. For example, an integer column might be marked as a Year or as a Zip Code. This information is used by Q&A in two important ways: For visualization selection and for language biases. First, Q&A uses the Data Category information to help make choices about what kind of visual display to use. For example, it recognizes that columns with date or time Data Categories are typically a good choice for the horizontal axis of a line chart or the play axis of a bubble chart. And it assumes that results containing columns with geographical Data Categories may look good on a map. Second, Q&A makes some educated guesses about how users are likely to talk about date and geography columns, to help it understand certain types of questions. For example, the when in When was John Smith hired? is almost certain to map to a date column, and the Brown in Count customers in Brown is more likely to be a city than a hair color. Note that some data categories, such as Year, will need to be entered as Custom Categories.
  • #17: The Sort By Column property allows sorting on one column to automatically sort by a different column instead. For example, when you ask sort customers by shirt size, you probably want your Shirt Size column to sort by the underlying size number (XS, S, M, L, XL) rather than alphabetically (L, M, S, XL, XS).
  • #18: The general rule of thumb youll want to follow is this: Each unique thing the user talks about should be represented by exactly one model object (table or column). So, if your users talk about customers, there should be one customer object. And if your users talk about sales, there should be a one sales object. Simple, right? Depending on the shape of the data youre starting with, it can be. There are rich data shaping capabilities available in Power Query if you need them, while many of the more straightforward transformations can happen simply using calculations in Power Pivot. Here are some of the more common transformations you might need to perform: Create new tables for multi-column entities If you have multiple columns that act as a single distinct unit within a larger table, those columns should be split out into their own table. For example, if you have a Contact Name, Contact Title, and Contact Phone column within your Companies table, a better design would be to have a separate Contacts table to contain the Name, Title, and Phone and a link back to the Companies table. That makes it significantly easier ask questions about contacts independently of questions about companies for which they are the contact and improves display flexibility (e.g. Default Field Set and Default Label).
  • #19: This is the only step which is applicable specifically to Q&A rather than Power View in general. Users often have a variety of terms they use to refer to the same thing, e.g. total sales, net sales, total net sales. Power Pivot allows these synonyms to be added to tables and columns within the model. Do not underestimate the importance of this step. As obvious as you think your table and column names might be, the users of Q&A are asking questions using the vocabulary that first pops into their heads, not picking from a predefined list of columns. The more sensible synonyms you can add, the better. A word of warning, however: Be aware that youre introducing ambiguity when you add the same synonym to more than one column or table. Q&A will utilize context, where possible, to choose between ambiguous synonyms, but not all questions have sufficient context. For example, when your user asks count the customers, if you have three things with the synonym customer in your model, they might not get the answer they are looking for. In these cases, make sure the primary synonym is unique, as that is what is used in the restatement. It can alert the user to the ambiguity (e.g. a restatement of show the number of archived customer records), hinting they might want to ask it differently. Excel auto-generates synonyms for you Excel auto-generates one synonym for each table, column, and field name. This is referred to as the Primary Synonym and appears bolded. If the name is recognized by Excel, that name is used as the synonym. For example, the Primary Synonym for Cumulative Medal Count is cumulative medal count, and Description is description. If the name is a concatenation of words recognized by Excel, the name will be split into those individual words. For example, the Primary Synonym for HostID is host id, and Country/Region is country region. If the name is not recognized by Excel, it is used as-is. For example, the Primary Synonym for NOC is noc, and ccrx123 is ccrx123. Understanding the Synonyms pane Which data model objects can have synonyms? Not every object from the data model will appear in the Synonyms pane. Calculated fields can have synonyms, but hierarchies cannot. If you find this confusing, you can turn off the display of objects for which synonyms cannot be created. To do this, in the Display area at the top of the Diagram, deselect one or more objects. In the Synonyms pane, you will notice that some are bolded and some are not. The bolded synonyms are the ones auto-generated by Power Pivot and are called Primary Synonyms. The lighter synonyms are those manually created. The auto-generated synonym can be deleted and replaced.
  • #21: Looking at all the workbooks you added for Q & A, and find all that can answer the question Score: Finding the best interpretation of the ones found Choosing the right visual, like table or chart, or map, so whrn you have the right answer how do you choose that particular visual
  • #22: Hypothetical example, you may have 3 different workbooks in your Q & A that may answer the question Show me the medal country and the country where the medals came from Ambiguity, Country also appears in question where the Olympics were held First step is finding all interpretations, and then Q & A puts scores on the different questions in order to rank them 2 important ones: Coverage of the words, Semantic distance ( small parts is better or compact interpretations than traversing all your model)
  • #23: First one, every word in the question was used, semantic distance was close Had to use an other table Sales scored poorly because all words were not used
  • #24: Visualization, ran through set of rules, to determine how to best visualize it Show 1 measure, 1 member (geo) -> map Show 2 measures, 1 member -> scatter Show 1 measure, 1 member -> bar If it was 1 measure and 1 member was geographic, it will show a map Other poteentials Go through different interpretations until You can influence this How the data is categorized in Power Pivot will influence it