Use a Power BI Tile in your web page

EDIT 2016-07-12: Microsoft has changed their API and policies and functionality, so this artcile is obsolete now. I will shut down the sample site. For details, please see: https://powerbi.microsoft.com/en-us/blog/what-s-new-and-what-s-next-for-power-bi-embedded-july-2016/

 

Creating dashboards is more witchcraft than art and more art than science. Think about how many times you have banged your head to produce a useful chart according to the specs of your client. You have the numbers, you have the chart and everything in place but the customer keeps saying: “it’s just not the right shade of green…”

Many times we have to create custom web pages to accomplish what the BI solution can’t offer out of the box; and Power BI provides you just that, the ability to use a tile from a dashboard without forcing the customer to open the entire Power BI user interface. With this technique you can embed just a piece of a bigger BI report into an HTML page. It’s not very straightforward (at least for me it wasn’t), and I must say that the documentation often makes it worse (there’s an errata in the GET REST Uri of the Step 3 – Get user’s tile information).

So you want to embed a Power BI Tile into a web page: to quickly see how it works go to this example page. If you want more details, just read on.
I assume you already have a Power BI dashboard and you know how to get an access token. This is probably the most difficult part and I covered that in another article.

I am using cURL from a headless EC2 server just for sake of simplicity, you can do this with your favorite programming language.

First step is to get the ID of your dashboard:

[code]

curl -k -X GET "https://api.powerbi.com/beta/myorg/dashboards" -H "$AUTH_HEADER_LOCAL"

[/code]

where $AUTH_HEADER_LOCAL is the “Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiI……..” you created when asking for an access token.

The server will respond something like this:
recorte

From here you need to extract the id element which represents your dashboard.

Second step is to get the tile ID:

[code]

curl -k -X GET https://api.powerbi.com/beta/myorg/dashboards/${DASHBOARD_ID}/tiles -H "$AUTH_HEADER_LOCAL"

[/code]

where $AUTH_HEADER_LOCAL is the same as above and ${DASHBOARD_ID} is the value you got in step 1.

recorte
the answer will include an element named “embedUrl” which is the link to your tile.

Now that you have the embedUrl and the access token, you can build a simple HTML page:

  • Create an <iframe> element that will host your tile
  • Have your access token ready into a variable, I personally have it in an external .js file that I load as a <script> inside the page
  • Add a <script> element to activate the <iframe> with the access token and the dimensions of the tile

[code language=”javascript”]
<script type="text/javascript">
var width = 800;
var height = 600;
var embedTileUrl = "https://app.powerbi.com/embed?dashboardId=1d555f40-…..2618e&tileId=8cd8bc5….7624a";

window.onload = function () {
updateEmbedTile();
};

function updateEmbedTile() {
if ("" === embedTileUrl)
return;
iframe = document.getElementById(‘iFrameEmbedTile’);
iframe.src = embedTileUrl + "&width=" + width + "&height=" + height;
iframe.onload = postActionLoadTile;
}

function postActionLoadTile() {
if ("" === accessToken)
return;
var h = height;
var w = width;
var m = { action: "loadTile", accessToken: accessToken, height: h, width: w };
message = JSON.stringify(m);
iframe = document.getElementById(‘iFrameEmbedTile’);
iframe.contentWindow.postMessage(message, "*");
}

</script>
[/code]

Note that the access token will expire after 1 hour, so you need to have a way to refresh it on the server side.

Feel free to use http://powerbitile.azurewebsites.net/ as a reference.

Pin an Excel chart/pivot table to a Power BI dashboard

According to this post on the official Power BI blog it is now possible to bring your Excel file into a Power BI dataset as a whole (not only importing data); and use portions of it as tiles on your dashboard.

Wow: you can (re)use your carefully crafted charts and amazingly engineered pivot tables in a web-shareable software as a service, not bad.

This is a very interesting feature and adds a range of new possibilities for this recent Business Intelligence Saas.

Now, few considerations.

  • To accomplish this you need to link an Excel file from OneDrive and the pin-to-the-dashboard option is only available to OneDrive for Business subscribers. So no free beers, sorry.
  • To pin a chart you actually need to select a range of cells where the chart resides and it will be pinned as an image, much like a print screen, so do not dream of interaction with the rest of the selectors/filters on the dashboard. See what I mean in this picture:
    recorte
  • Your chart/table/range will appear in the dashboard as a <div> with a base-64 encoded data:image/png background:
    recorte

I did not try to embed this tile in a web page, but it could be the next logical step to do.

So, overall, a very desirable addition. Prices for OneDrive for Business start at 5USD per user/month, and if you are a no-profit/government/university you can have it even chaper.

 

Twitter Trending Topics by Location, Date (PowerBi WordCloud)

Twitter Trending Topics by Date, Location

One of the things I like the most about Power BI is the ability to use custom visuals. I did not try building one myself and honestly I think it’s beyond my skills, yet there is a gallery with a growing number of pre-build viz that you can download and use.

Go to https://app.powerbi.com/visuals?WT.mc_id=Blog_Visuals and take a look.

I downloaded the WordCloud plus the Chiclet Slicer to create a report showing the trending topics in Twitter sliced by date and location.

Only few locations are available from Twitter API, here is a sample of spanish cities I am monitoring.

Happy tweets!

Microsoft Azure SQL Data Warehouse. How much does it cost?

As many of you know, Microsoft has open to public preview its latest BI technology: Azure SQL Data Warehouse.
On the documentation page is defined as: “an enterprise-class distributed database capable of processing petabyte volumes of relational and non-relational data.”
How much does it cost, is it worth it, and how does it differ from previous SQL offering on Azure?
The main difference -in my opinion- is the billing of compute power and storage:
in Azure SQL DW you decide how fast your query run and how big your data is. This generates a charge for DWU (Data Warehouse Unit), a unit of measure defined as the level of performance of your data warehouse; and a charge for RA-GRS Page Blobs (Read-Access Geographically Redundant Storage).
I suppose this will cause more than a headache in financial departments and SQL administrators, but it is indeed a good thing.

The main point I see here is that -wisely used- the DWU pricing can help tune the database responsiveness and number-crunching speed according to the time of the day (or day of week). I can set for example 100DWU on low traffic hours or weekends, and scale up to 2000DWU Monday morning or during the reporting season when everybody want to access data. I can even choose to pause the database -for example during holidays- and incur in no compute  charge at all. It would be extremely useful to have PowerShell scripts to switch from a DWU level to another and schedule them so that the database can auto-adjusts the compute power needed.

Doing a very simple math: assuming a 30 days month, at 100DWU we will receive a $504 (USD) bill for computing + $62.5 (USD) for 1TB of data, a total $566.5* for a fully functioning SQL Data Warehouse. On the other end of the spectrum, 2000DWU for 1 month would be $10,080(USD)*.

Additionally to the Grow, shrink, or pause capabilities, Azure SQL Data Warehouse features an MPP architecture, column-store indexes, and the ability to query relational and non relational (read Hadoop) with Polybase, more on this here.

So my .02: let’s give it a try…

*During the preview period the prices are discounted, I do not expect them to vary that much anyway, but certainly at the moment it is not clear how much the discount is and how much will the price be after the preview, see here for more details: https://azure.microsoft.com/en-us/pricing/details/sql-data-warehouse/

The Lahman's Baseball Database – real life sample data

Mr Lahman is (by his own words) a Watchdog reporter & data journalist, and an author.
But the collective wisdom of the Business Intelligence world reminds him as the creator of a wonderful public dataset: the Baseball Database.

The Lahman database is extremely useful and I am grateful to his author for several reasons:

  1. It is not a bike/sport/equipment shop sales database, that almost all examples in the BI world use (my book is not an exception…)
  2. It is real life data with an interesting subject, and frequently updated.
  3. And it has errors (very few), those little nice imperfections that come very handy when you are testing/training/developing a test case or a proof of concept.
I don’t remember who told that PowerPoint is always right and nothing can go wrong with a slide presentation -as long as the projector is working and the usb drive is plugged in- but he surely would agree that it’s no fun to demo with a database that is perfectly crafted to return exact results.
I never found in my life a “squeaky clean” production db. As a general rule there is always a primary key missing, an unenforceable foreign key, or missing records, outdated documentation, you name it…that’s why I like the Lahman db, because I can experiment -under a controlled environment- how the software behave in situations that are beyond the standard Show and Tell “look-ma-this-works” case scenarios.
It is available for free and you can download it in several formats from here:
I converted it to Oracle and SQL Server, and you can download those version too from here:
In Oracle, simply run IMPDP with the SCHEMAS=LAHMAN option, the password for the LAHMAN schema is (guess) LAHMAN.
In SQL Server copy the MDF into the DATA folder and attach it to an existing SQL Engine instance.
Hope this helps, I will often use this data to showcase MicroStrategy and other products in my next posts.

See-through MicroStrategy (playing with transparencies)

One thing that I like about MicroStrategy is the ability to create Web documents. Nowadays self-service discovery technologies (being based on Flash or HTML5) tend to constrain a little the “creative” side of the data analysis: trading off the ease of use for a quite limited set of tools to picture your data. Boundaries are build and it is difficult to bend the rules.

Web documents do not have such restrictions and -in fact- are almost limitless in possibilities. Think about the HTML container that allows us to insert custom HTML/JavaScript, or the image container.

I believe that Business Intelligence is about information as much as about design, and -surely enough- a nice looking dashboard is always preferable to a boring dull one.

A fundamental aspect of modern day web pages is the CSS Opacity / Transparency, so why not using it in a dashboard to spicy it up? The components and widget that come with MicroStrategy allow a very limited use of “No Fill” in the background: you can create a transparent Chart or Panel, but not a transparent rectangle. So I simply created a series of white PNGs with different alpha values that I can insert on top of a background picture to simulate a see-through rectangle.

Then I downloaded a background picture on Flickr (careful with the CC license…) and removed the main panel of a blank dashboard.

You can see the result here, where a Web document turns to A Whiter Shade Of Pale.

now you see me now you dont

You can download the transparent pngs here.

Average of Averages… no, seriuosly.

Tom, Bill and Sandra have one euro each; they go and buy some fruits.

Tom  buys 3 apples for 1€. Bill buys 2 oranges for 1€ and Sandra buys a coconut for 1€.
Tom says “apples cost 0,33 euro cents on average”.
Bill “oranges are 0,50 euro cents on average”
Sandra “coconuts are 1 euro on average”

What is the overall average price of fruits ?

I have seen more than once calculating this as:

0.33 + 0.50 + 1 = 1.83
1.83 / 3 = 0.61

Now, you are a Business Intelligence professional: imagine your customer asking you to calculate an average, such as the price per product grouped by product type and the total overall price per product at company level.
Next he dig into his laptop and proudly shows off his “treasure” Excel sheet that does it all, and by magic copy and paste computes the AVG() of the AVG()…

Unfortunately for your client, acrobatic math is not yet an Olympic discipline.

Tom, Bill and Sandra soon realize that they bought 3 + 2 + 1 = 6 pieces of fruit and they spent 3€, so a single piece is 0,50 euro cents.

Similarly your customer (after an intense discussion) will realize that for the last 15 years he used to calculate a price per product average “from a different point of view”. You may want to ease the pain by saying that it is a quite common error and he’s in good company: see Simpson’s paradox and the Berkeley gender bias case

In MicroStrategy terms this is what is called a “Smart Metric”.
smartmetric

When in the Metric Editor, switch to the Subtotals / Aggregation tab, down to the left there’s a check box that you can enable to surprise your customer… cool.

How to Lie with Statistics (by Darrell Huff)

I just finished reading this 140 paperback about statistics, and I found it enlightening:
How to Lie with Statistics: Written by Darrell Huff, 1993 Edition, (Reissue) Publisher: W. W. Norton & Company [Paperback]

It was written back in 1954, but the topics it covers are timeless, and -not surprisingly- Mr Huff has few very useful tips applicable to nowadays business intelligence. You can only tell that the text is 60 years old when looking at the details: in one example there is the price of a good hotel room in San Francisco (an eye popping figure: 8$).

One of the first Chapters explains the “biased sample” problem, which affects so many scientific results; another one points the finger to an old trick used to convey distorted messages with line graphs. Many of the wrongdoings he exposes, sounds hard to believe, are still in use today, and you can find plenty of them in ads, magazines, and (sadly) in government polls. Not earlier than one month ago, one very famous cable news network was showing a chart picturing an “impressive” improvement in Spanish unemployment rate: from 26% down to 22% (ouch!). It was something like this:

spunemp2

What they failed to show was data from preceding years:

spunemp1

data source: Spain Unemployment Rate 1976-2015

One of my favorite part is when Mr Huff shows how unions or entrepreneurs give inconsistent (to say the least) results from the same set of data. Or how “experts” sum percentages in an mind bending attempt to rewrite the math rules: “buy any 20 products, wait one year, see their price increase by 5%, so overall the cost of life (20 x 5% = 100%) has doubled in one year…”.

Oh, one last point: don’t forget to read the exhilarating story about the rabbit burger…

There OAuth to be a better way (Power BI)

Edit: The base URL has changed from https://api.powerbi.com/beta/myorg to https://api.powerbi.com/v1.0/myorg

OAuth is clearly becoming the de-facto standard for authenticating API calls around the web. In the business intelligence arena, while we can discuss its pros and cons, we undoubtedly must get acquainted with it because sooner or later we’ll be tasked with importing data from one of the many “OAuth protected” web services.

In this third release of the OAuth series I’ll show how to get an Access Token from Microsoft Power BI. The same procedure can be used for many other Azure services (by changing the appropriate scope in the resource parameter).

Our typical scenario would be an unattended server process downloading data; I’m using a headless Linux box with cURL and jq.

You can see here and here my previous posts about how to use cURL to authenticate with flickr or BigQuery and make API calls with an Access Token.

The way Azure works is a little different. You get an Access Token valid for 1 hour and a Refresh Token. You can reuse the Access Token for as many calls as you want during the hour, and then you’ll need to ask for a new Access Token presenting the Refresh Token.

Prerequisites:

  • An Azure subscription with a real work domain (no personal account)

loginazure

  • An Azure Active Directory so you can add users to @yourdomain

azuread

  • A Power BI subscription with a user belonging to your Azure Active Directory

azure users

Once you have the requisites in place, follow this article to create an app and get a Client ID.

clientid

With the username, password and the Client ID you can use this script to get an Access Token:

[code language=”bash” gutter=”true” light=”false”]
#!/bin/bash

OAUTH_CLIENT_ID="yourClientId"
OAUTH_USERNAME="youruser@yourdomain.com"
OAUTH_PASSWORD="yourpassword"

POST_RESULT="$(curl -s -X POST -d "resource=https://analysis.windows.net/powerbi/api&client_id="$OAUTH_CLIENT_ID"&grant_type=password&username="$OAUTH_USERNAME"&password="$OAUTH_PASSWORD"&scope=openid" "https://login.windows.net/common/oauth2/token" | jq -r .)"

REFRESH_TOKEN="$(echo ${POST_RESULT} | /usr/local/bin/jq -r .refresh_token)"
ACCESS_TOKEN="$(echo ${POST_RESULT} | /usr/local/bin/jq -r .access_token)"
AUTH_HEADER="Authorization: Bearer ${ACCESS_TOKEN}"
echo "${AUTH_HEADER}"

echo "${AUTH_HEADER}" &gt; ./auth_header.txt
echo "${REFRESH_TOKEN}" &gt; ./refresh_token.txt

[/code]

This script will save two files: one is the Authorization Header and the other is the Refresh Token. You will use the Authorization Header passing it to every API call that you make (during 1 hour), for example to get a list of the available datasets in your Power BI storage use:

[code language=”bash” gutter=”true” light=”false”]
#!/bin/bash

AUTH_HEADER=$(&lt;./auth_header.txt)
curl -k -s "https://api.powerbi.com/beta/myorg/datasets" -H "$AUTH_HEADER" | /usr/local/bin/jq -r .

[/code]

After an hour or so, you will ask for a new Access Token and store the new Authorization Header (can also be crontabbed every nn minutes):

[code language=”bash” gutter=”true” light=”false”]
#!/bin/bash

REFRESH_TOKEN=$(&lt;./refresh_token.txt)
OAUTH_CLIENT_ID="yourClientId"
OAUTH_USERNAME="youruser@yourdomain.com"
OAUTH_PASSWORD="yourpassword"

POST_RESULT="$(curl -k -s -X POST -d "resource=https://analysis.windows.net/powerbi/api&client_id="$OAUTH_CLIENT_ID"&grant_type=refresh_token&username="$OAUTH_USERNAME"&password="$OAUTH_PASSWORD"&scope=openid&refresh_token=${REFRESH_TOKEN}" "https://login.windows.net/common/oauth2/token" | jq -r .)"

REFRESH_TOKEN="$(echo ${POST_RESULT} | /usr/local/bin/jq -r .refresh_token)"
ACCESS_TOKEN="$(echo ${POST_RESULT} | /usr/local/bin/jq -r .access_token)"

AUTH_HEADER="Authorization: Bearer ${ACCESS_TOKEN}"

echo "${AUTH_HEADER}"
echo "${AUTH_HEADER}" &gt; ./auth_header.txt
echo "${REFRESH_TOKEN}" &gt; ./refresh_token.txt

[/code]

So, no browser, no GUI, no problem!