Application Insights and Power BI – Expand dynamic customDimensions JSON object into new columns automatically

Posted by & filed under , , .

(Note: If you’re already familiar with importing Application Insights data into Power BI, and just want to skip to the solution for dynamically expanding the customDimensions JSON object into columns, skip to Solution 2)

When using Application Insights, there are a number of additional attributes you might wish to track. One common example when tracking SharePoint analytics is to include the User’s name in the data, there’s great examples of how to add this tracking in the official documentation. Basically any custom tracking metrics you want to use need to be passed back in a JSON object under the “customDimensions” parameter.

appInsights.trackPageView(
    document.title,
    window.location.href, 
    {   // customDimensions
        User: personProperties.get_displayName()
    }
);

This works well, and you’re able to query against the values in the Application Insights Analytics query tool using “tostring(customDimensions.User)”  as the column name, for example:

pageViews 
| where timestamp > ago(7d)
| summarize count() by tostring(customDimensions.User)
| top 100 by count_ desc nulls last

This query will give you the top 100 most active users (by page views) from the last week, with a summary of how many page views each user had.

That’s great, but a more common use case for this analytics data is to load it into Power BI where we can build reports with visualizations and embed anywhere we want using an iframe. The process for exporting a query into Power BI is well documented.

Personally I prefer to load the entire set of data into Power BI so that I’m using the same set of records for all of my visualizations. This has the advantage of ensuring that any interactivity you wish to set up (for example, through slicers or drilldowns) will work for any of your visualizations, since all visualizations have the same record fields, and are therefore able to be “sliced” along any field on the record.

This means the query I export from Application Insights Analytics to Power BI is the incredibly simple:

pageViews
| where timestamp > ago(30d)

This query provides all of the pageView data that is available in Application Insights within the last 30 days of recorded data. If you had a very large site you could certainly filter this down further before exporting, trimming it so that it only goes back X number of days would certainly be a great way to slim the data down quickly. For most purposes going back further than 1 month on your Analytics Dashboard is likely unnecessary.

Problem 1 – customDimensions JSON data inaccessible

Once you get the data into Power BI, you’ll notice that the customDimensions column is just stored as a raw string, and the JSON data inside it isn’t readily accessible. Luckily, there is a nice feature built into Power BI for expanding JSON fields.

Solution 1 – Manually expanding customDimensions JSON object into additional fields

  1. Open the Power Query Editor for your query
  2. Right click on the “customDimensions” column header
  3. Navigate to Transform > JSON

  4. This converts the JSON column into Record objects, but we’d actually prefer if each property was it’s own column, luckily there’s a convenient UI button to do just that, in the right side of the “customDimensions” header, there is now an “expand” icon button, click that button:

  5. This brings up a dialog to map these inner properties to new columns in our table:


    Personally I like to leave “Use original column name as prefix” checked, so that all the new columns have “customDimensions.” as a prefix, letting me know that this is custom data my code is responsible for, but you can uncheck it to create the new columns without this prefix if you’d prefer.

  6. Click Ok to accept this list and generate your new columns

Now the values from your customDimensions JSON object are mapped into new columns, and are ready to use just like any other column.

Problem 2 – New customDimensions properties are not expanded automatically

That works great if you set up your custom JSON data once and never change it, but what happens if you start adding new properties to the customDimensions object? The good news is, all of your existing expanded columns are still present and working, the bad news? The new data in the customDimensions object is now lost in the shuffle, the properties you’ve expanded are correctly mapped, but the new data is just thrown away because there isn’t any mapping defined for it.

You can repeat the steps above to expand the JSON object into columns once more, but you’ll first need to remove the previous mapping you put in place.

  • Note that you’ll need at least one entry returned from Application Insights that has your new properties in the customDimensions object. I usually verify with the Application Insights Analytics query tool that I have new entries with the updated customDimansions object before I update anything in Power BI
  • Note that these steps are not the recommended way to handle this, please read the “Solution 2” section below for the proposed correct way, these are just example steps to demonstrate the hassle of manually expanding the JSON object when it gets updated

NOT RECOMMENDED, see Solution 2 for recommended approach:

  1. Open Power Query Editor for your query
  2. In the right column locate the “Applied Steps” section:

  3. Remove the Parsed JSON and Expanded customDimensions steps by clicking the X next to those steps
  4. Select the step that was just above these two steps
  5. Click the “Refresh Preview” button in the ribbon to ensure you have the updated data
  6. Perform the steps from “Solution 1” above, this time expanding all of your new JSON properties into columns as well. As long as you select all of your previously expanded columns, and make sure you use the same “Use original column name as prefix” setting, the resulting column names should match the previous ones, which will leave all of your visualizations correctly wired up.
  • Note that if you have additional steps lower down in the list, you’ll receive a warning that “Inserting an intermediate step” can be dangerous, one potential way to avoid this would be to create a new query each time you had to make these changes, but that would also require wiring up all your visualizations to use the new query as well

All in all, re-expanding your JSON object into columns manually is really simple to do once, but after that the overhead and risk of repeating the manual expansion quickly becomes too much of a headache to be realistic.

Solution 2 – Dynamically expanding customDimensions into new columns automatically

Personally I really don’t like the idea of having to reconfigure the query every time I want to add a piece of data to the customDimensions, especially if it requires “inserting” a step in the middle of my query configuration. Analytics dashboards tend to be used by pretty important stakeholders, and I really hate the idea of having to so thoroughly rewire something that’s already working, just so I can access my new piece of data.

The solution is to update the query using the “Advanced Editor” and write some more dynamic expansion of the customDimensions object. I have to give tons of credit to Mike Honey who has a gist function for expanding all record columns recursively, which in turn was based on Chris Webb’s blog post with a similar function for expanding all columns in a table.

Since all of my data is coming from Application Insights, I know that all of my custom data is only going to be stored in the customDimensions object. This means we don’t actually need their powerful generic functions for expanding it, we can just copy some of their logic directly into our query.

If we start by looking at our query at the end of Solution 1, it gives us a good idea of what’s going on:

  1. Open Power Query Editor for your query
  2. Click Advanced Editor in the ribbon
  3. Examine the bottom of the query, this section contains the code that expands the JSON object into the new fields, but as we can see, the values are hardcoded based on the fields that were present at the time we did the expansion:

  4. If we examine the documentation on Table.ExpandRecordColumn, we can see that the method call breaks down like this:

  5. We want to keep the first two parameters as-is, it’s just the second two that we want to change to be constructed dynamically, so if we look at Mike Honey’s gist, we can see how he’s constructing these two lists and duplicate it in our code.
    This version will include the “customDimensions.” prefix on all the new columns names (my preferred approach), you can also easily change this prefix by editing the “customDimensions” string on line 3 of this snippet:
        #"Parsed JSON" = Table.TransformColumns(AnalyticsQuery,{{"customDimensions", Json.Document}}),
        #"ColumnsToExpand" = List.Union(List.Transform(#"Parsed JSON"[customDimensions], each Record.FieldNames(_))),
        #"NewColumnNames" = List.Transform(#"ColumnsToExpand" , each "customDimensions" & "." & _),
        #"Expanded customDimensions" = Table.ExpandRecordColumn(#"Parsed JSON", "customDimensions", #"ColumnsToExpand", #"NewColumnNames")

    This version leaves off the prefix, and just creates the new columns using the property names from the JSON object:

        #"Parsed JSON" = Table.TransformColumns(AnalyticsQuery,{{"customDimensions", Json.Document}}),
        #"ColumnsToExpand" = List.Union(List.Transform(#"Parsed JSON"[customDimensions], each Record.FieldNames(_))),
        #"Expanded customDimensions" = Table.ExpandRecordColumn(#"Parsed JSON", "customDimensions", #"ColumnsToExpand")
  6. Edit the query in the Advanced Editor by replacing the two existing lines #”Parsed JSON”… and #”Expanded customDimensions”… with your preferred snippet above (highlighted section represents newly pasted text)

  7. Click Done to save the updated query
  8. Review the query preview data in the Power Query Editor to verify that the properties in your customDimensions object have been expanded into their expected columns

And that’s all there is! Now that the query itself is written to expand the JSON object into new fields automatically, you don’t need to open the Power Query Editor at all when you add new properties to customDimensions. Once a record comes in that has the new JSON property on it, a new field will be created for the new property automatically, and you can start creating new visualizations that use this property.

Keep in mind that in most cases for any visualizations that use this new value, you’ll want to add a filter to remove records that don’t have a value in this column. Otherwise all of your historical data from before you added the new property will be included, likely as the largest chunk of data until it gets “pushed out” by newer data using the updated customDimensions object.