What is ASPerfCounters.JSON?

In a previous blog post, I talked about the various diagnostic data that Power BI Desktop makes available for troubleshooting issues.  One of the more interesting files generated is ASPerfCounters.JSON.

I originally thought the name came from ASP.NET performance counters, but I am guessing it actually has its roots in Analysis Services.

I was curious to see what data was contained in this file, so I used Power BI Desktop to open and parse it.  Power BI Desktop has a JSON file connector, so it was relatively simple to crack the file open and see what was inside.  I was pretty surprised to see that the file actually contains PERFMON data covering the hour leading up to the time the file was generated.  The data is collected every 60 seconds for the following performance counters:

  • PagedMemorySize64InBytes and PeakPagedMemorySize64InBytes
  • VirtualMemorySize64InBytes and PeakVirtualMemorySize64InBytes
  • WorkingSet64InBytes and PeakWorkingSet64InBytes
  • PrivateMemorySize64InBytes
  • SystemAvailableDiskSpaceInBytes
  • SystemAvailableMemoryInMB
  • SystemCPUUsagePercentage
  • TotalProcessorTimeInMillisecs
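To give a sense of the file's shape, each record looks roughly like the following.  This is an illustrative sketch pieced together from the columns discussed in the steps below (t, i, n, u and p), not a verbatim excerpt, so the actual field names and values may differ:

```
[
  {
    "t": "/Date(1525104000000)/",
    "i": "...",
    "n": "...",
    "u": "...",
    "p": [
      { "key": "SystemCPUUsagePercentage", "value": 12 },
      { "key": "SystemAvailableMemoryInMB", "value": 8192 }
    ]
  }
]
```

The `p` field holds the list of counter key/value pairs for that sample, and `t` is the timestamp that gets parsed out later in this post.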

To load the data into Power BI Desktop, I did the following.

  1. Use the process in the earlier blog post to generate a Frown Snapshot file.  Open the generated zip file and copy out the ASPerfCounters.json file.
  2. Use the JSON File connector to ‘Get Data’ and open the ASPerfCounters.json file.

  3. The Power Query Editor will open and show a list of records.  *Note – the following steps are how I imported and cleaned the file; no doubt there are other ways it can be done as well.*

  4. First we need to convert the list to a table.  If you right-click the list column, you will see an option called ‘To Table’.

Select ‘To Table’ and you will get a confirmation dialogue asking for delimiters and how to handle extra columns.

I left everything as default as shown and selected ‘Ok’.

  5. At this point, your list has been converted to a table.  I selected the expand icon and selected ‘OK’ in the dialogue that appeared:

  6. Once expanded, Column1.p is yet another list that needs to be expanded.  When you select the expand icon, you get the option to expand to new rows.  Select that and you will now see that Column1.p contains records that need to be expanded.  Expand Column1.p one more time and select both the key and value columns.  You will most likely want to ensure that ‘Use original column name as prefix’ is unchecked.

  7. At this point, you have all of your data expanded and accessible, but there are a few more things to be done to get at least a basic set of useful data.

First, delete columns ‘Column1.i’, ‘Column1.n’ and ‘Column1.u’.  Nothing useful there.

Next you will want to pivot the key column.  Select the key column and choose ‘Pivot Column’ from the Transform section of the ribbon as shown:

When the dialogue pops up, select the default which should have the ‘Value’ column chosen as the ‘Values Column’.

This will display the data in a nice familiar format, date in the first column and columns for each performance counter.  Of course, the data looks pretty odd:

The date is displayed in what is apparently a native JSON date format.  I had to go hunting on the Power BI Forum to find out how to handle this format.  Luckily, Greg Deckler had this one in the bag.  The date is stored as milliseconds since 1/1/1970.  He lists a way to convert it using a calculated column, but I wanted to convert it in Power Query, so I used the following steps.

I selected the date column, then chose ‘Extract’ from the Transform section of the ribbon.  I chose ‘Text Between Delimiters’ and used the open and close parentheses as delimiters as shown.
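In M, the step that this UI action generates should look something like the following.  This is a sketch; `#"Pivoted Column"` stands in for whatever your previous step happens to be named:

```
= Table.TransformColumns(
    #"Pivoted Column",
    { { "Column1.t", each Text.BetweenDelimiters(_, "(", ")"), type text } }
)
```

Text.BetweenDelimiters pulls out just the millisecond value from the "/Date(…)/" wrapper.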

Now I was dealing with a column that was in milliseconds, so I set about parsing out the date and time.  I first renamed ‘Column1.t’ to ‘DateMill’, converted that column and all of the perf counter columns to actual numbers, and then, to get the date, I added a custom column with the following code:

Date.AddDays(#date(1970,1,1), Number.IntegerDivide([DateMill], 86400000))

To get the time value in GMT, I created the following custom column:

= Table.AddColumn(#"Added Custom", "Custom.1", each Time.From(([DateMill]/86400000) - Number.IntegerDivide([DateMill],86400000)))
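If you would rather have a single datetime column than separate date and time columns, the same conversion can be done in one custom column.  This is a sketch I have not validated against the author's exact file; the step name follows the one above, and the new column name is my own:

```
= Table.AddColumn(
    #"Added Custom",
    "TimestampUTC",
    // milliseconds since the Unix epoch, converted to a datetime
    each #datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, [DateMill] / 1000),
    type datetime
)
```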

At this point, you can spend more time cleaning up the data and charting it as you wish:

 

So, why is this useful?  Well, it’s cool that you can use Power BI Desktop to parse JSON files.  This file might prove useful if you have a large Power BI model that takes hours to refresh.  In the case of a crash, you can look at this file and see what was going on with your desktop’s performance.  You could also use these metrics to review performance during a refresh in case Perfmon is not your cup of tea.

 

Copy Session Diagnostics to Clipboard

Another quick hitter on Power BI Diagnostics.

If you have your model open in Power BI Desktop, you can get a quick dump of all of your Power Query code by doing the following:

  1. From the top ribbon, choose ‘Help’, then ‘About’.

  2. This will launch a popup window.  Choose ‘Copy diagnostics to clipboard’.

  3. Open Notepad (or Word, or OneNote, etc.) and paste.  All of your Power Query M code will be included in the information that is copied.

  4. Note also that when you do this, you generate a ‘Frown Snapshot’ file in your diagnostic folder.  By default, this snapshot will contain some ASPerfCounter data and ‘FlightRecorderCurrent.trc’, which is your Power BI ‘black box’ recorder.

Nifty eh?

Getting started with troubleshooting a slow Power BI report page

Performance is always an issue, isn’t it?  Throughout a career that wandered through IIS (my web pages are slow), SharePoint (my web parts/lists/search queries/indexes are slow), SQL Reporting (my SSRS reports/stored procedures/Excel pages are slow) and now Power BI (my visuals are taking forever to load), troubleshooting slow performance has always been a big part of what I do.

Troubleshooting Power BI visuals can be a little tricky.  There aren’t any obvious dials or gauges to look at, you can’t spin up Perfmon and attach it to Power BI Desktop, and the logs, while impressive looking, won’t help you narrow in on the poorly written measure that is killing your performance.  What I am going to lay out next is a quick approach that you can take to not only get a good look at the performance of a report page but also narrow in on the measures that are dragging you down.

Quick setup note – I am using the Customer Profitability sample from Microsoft for my PBIX file.  Its visuals load super quickly, but it’s a quick and easy download here.

First things first, your reports and data model need to be in the same PBIX.  We will be using DAX Studio to connect to the data model and run a trace, so everything we are testing needs to be in the same PBIX.  If you have your visuals and data model in separate PBIX files, you will need to recreate your visuals in the PBIX where your data model lives.

STEP ONE:

Create a blank report page.  Power BI Desktop will load the visuals on whichever report page is active when you open the PBIX, so in order to capture a true picture of the page performance, you need to create a blank report page and save the PBIX with that page active.

STEP TWO:

Close and reopen your PBIX file.  If you did step one, you should be looking at a blank report page.

STEP THREE: 

Open DAX Studio and the ‘Connect’ screen should open.  Select your open PBIX document as shown and select ‘Connect’.

STEP FOUR:

Once connected, click on the ‘All Queries’ button in the ribbon.  This actually starts a trace on the SSAS instance that is running inside Power BI Desktop.  Once the trace is ready, you will see ‘Query Trace Started’ in the output window.

STEP FIVE:

Return to the PBIX that you have open in Power BI Desktop.  Click on the report tab that you wish to trace and let the page fully load.  Once the page loads, you can stop the trace by returning to DAX Studio, choosing the ‘All Queries’ tab and selecting the stop button.

STEP SIX:

Once the trace is stopped, click on the Duration column header to sort the queries by duration.  As I mentioned earlier, this demo is super fast, so the ‘slowest’ query took 21ms, but hopefully you get the point.  You now have a list of the queries that were run to build your page, along with the time it took to execute each one.

STEP SEVEN:

Continuing on, double click on the query text in the ‘Query’ column.  The actual code used will show up in the editor section above the output section.  Now you can analyze the DAX being called as well as run an individual trace to dig in deeper.

DIGGING DEEPER:

At this point, you can run all of the DAX in the editor or you can highlight and run just sections of it, just like normal in DAX Studio.  If you enable the Query Plan and Server timings options, you can capture a trace and see the actual queries that are being passed to the formula and storage engine for processing.  Enabling the query plan option does just what it says, it gives you the query plans, both physical and logical, that were chosen to run the queries.
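For reference, the query text captured for a visual is typically an EVALUATE block along these lines.  This is an illustrative sketch rather than a query from the sample file; the table, column and measure names are made up:

```
EVALUATE
SUMMARIZECOLUMNS (
    'Date'[Month],                       -- hypothetical grouping column
    "Total Revenue", [Total Revenue]     -- hypothetical measure
)
ORDER BY 'Date'[Month]
```

Pasting or capturing a query like this into the editor is what lets you trace a single visual's work in isolation.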

I have a long animated GIF below that shows turning on the Query Plan and Server Timings options, setting the ‘Run’ option to flush the cache each time I run a query, then running the query.  I then show where you can find the query plan and server timings information.  Since I captured the whole screen in the GIF, it’s a pretty lousy resolution, but perhaps if you open it in another tab, you can see enough detail.

Questions, comments, suggestions on digging deeper into Power BI visuals/reports performance?  Throw me a comment or hit me up on Twitter – @szBigDan.

A little bit about Power BI Diagnostics

Gilbert Quevauvilliers from FourMoo posted a great blog post identifying all of the different processes that run when you have Power BI open.

He identifies four different processes that you will see running and a quick blurb on each.  I wanted to post a quick follow up to show how the diagnostic logging in Power BI captures output from these processes.

Power BI Desktop has diagnostic logs?  Absolutely!

To turn on the logging, enable it from your options as follows:

There is a link that you can click on that will take you directly to the traces folder in case you aren’t the type to memorize logging locations for software.

When you open this folder, %LocalAppData%\Microsoft\Power BI Desktop\Traces, you will find there are actually two sets of logs.  The first set is stored directly in this folder, and these files are only created once you turn on tracing.

If you open the Performance folder inside this directory, you will see that Power BI Desktop is actually ALWAYS logging.  I am fairly certain that these are a rolling set of logs that are captured in case of an unexpected failure.  When Power BI crashes, it will create a FrownSnapShot zip file and a PerformanceTraces.zip file.  The latter is a zip of the Performance folder.  The FrownSnapShot zip file is pretty cool because it contains the latest SQL Flight Recorder data that has been captured.  Interestingly enough, the FrownSnapShot zip also includes a file called ASPerfCounters.json, which looks like Perfmon data dumped into a JSON file.  Interesting…

The log files that you will find in the Traces folder are as follows.

  1. PBIDesktop.<version>.<TimeStamp>
    • This file logs information for the Power BI Desktop application.
  2. Microsoft.Mashup.Container.NetFX45.<version>.<TimeStamp>
    • This log appears once you go into the Query Editor or refresh your data.  This is where Power Query actions are logged during the processing and cleaning of your data.
    • Once you start a data refresh in Power BI Desktop, you will see multiple log files that start with this name.  I assume that Power BI Desktop spins up multiple Power Query engines as necessary and each gets its own log file.
  3. MSMDSRV.<version>.<TimeStamp>
    • In my experience, you will only see this log if an exception is captured by the SSAS engine in Power BI Desktop.  For instance, I saw this log appear when I was troubleshooting a crash that was occurring during data refresh.  This log is also included in the FrownSnapShot zip that is created during a crash.

If you have questions about Power BI Logging or want to add corrections, please leave me a comment!

 

 

Where did my arrows go?

Quick hitter on this one.  I pushed some reports to the Power BI Service recently and noticed that suddenly I didn’t have the ability to drill down into my matrix visualizations.  I could right-click a row in the visualization and get the option to drill down, but I am used to being able to hover over and see the following arrows pop up:

It’s a quick, easy fix to get these back.  As of the March 2018 release of Power BI Desktop, you have the ability to turn off the visual header.  This means that while the report is in reading mode, the headers won’t appear when you mouse over the visualization.

Sure enough, when I went into the Reports Section of the workspace and checked the settings, I saw this:

Not sure how that got toggled on for my reports, but since I had built a bunch of them from the same template, it’s easy to see how they all had that setting applied.  Toggle that switch and it’s back to normal.

Simple fix.

The anatomy of a DAX mistake – Analyzing DAX Performance

I have been working with a retail customer that uses a 445 date table.  Matt Allington over at PowerPivotPro wrote a great blog post on what a 445 date table is and how to create one, but creating one wasn’t my issue because the customer provided me with the one they use (whew).

Instead, I was doing some custom time intelligence with DAX, since I couldn’t use the built-in functions due to the wacky date table.  Specifically, the customer wanted a lot of their metrics expressed in terms of the last week, the last four weeks, the last 13 weeks and in total.

As a side note here, the interesting thing about DAX is that there are usually all sorts of ways to do the same thing.  I am not giving a template for how to solve this problem with the following approach, I am just using what I did to show you how I found my first (hah, hah) performance problem and solved it.

I started by defining a simple measure that would give me the latest date for the data that I had in my data set.

CURRENT WEEK ID = MAX ( 'Date'[DateID] )

Since my Date table only has data for weeks that I have data in my fact table, this gives me the max Date ID for my dataset.  With this information in hand, I created some custom helper tables in my model.  The example below is the one created for the previous thirteen weeks helper table.

Rolling_13_Date =
VAR maxID = [CURRENT WEEK ID]
RETURN
    FILTER (
        'Date',
        'Date'[DateID] >= ( maxID - 12 )
            && 'Date'[DateID] <= maxID
    )

I made a helper table for each of the custom time periods: 13wk, 4wk and 1wk.  These tables are created dynamically when the data model is refreshed, and they are joined back to the main data table as shown:
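The one-week table follows the same pattern as the 13-week example above; this exact variant is my reconstruction rather than code from the post:

```
Rolling_1_Date =
VAR maxID = [CURRENT WEEK ID]
RETURN
    -- only the single most recent week survives the filter
    FILTER ( 'Date', 'Date'[DateID] = maxID )
```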

So the thinking was this, to make life easy for myself, any time that I had to present a measure over one of these time periods, I would just do the following:

My4WkMeasure = Calculate([MyMeasure], Rolling_4_Date)

Initial testing went great, it worked as expected and it was super easy to remember this approach.

Of course, I did mention that when I am doing development, it is against a week of data, right?  Yeah, I probably should have mentioned that.  Once I pulled in all of the data, here is a sample of the blazing fast speed that my measure was cranking out:

 

 

That is the ‘Server Timings’ readout from DAX Studio.  My measure, just one single measure using this approach, took 4.5 seconds to load.  For such a simple measure (basically total sales), it actually would have taken almost 15 seconds if the engine didn’t spread the love out over multiple cores.  Even better, it’s not a lot of data being materialized; the time is all CPU time, so I am making the query engine think WAY too hard for such a simple calculation.

When I started digging into the query, I quickly saw my mistake.  I don’t quite understand the full details of it because I am not Marco Russo or Alberto Ferrari, but I could see enough from the queries shown in DAX Studio to understand what I did wrong.  By specifying an entire table as a filter instead of a column, the query engine performed a massive join that pulled in my entire rollup table, the main date table and the other connected rollup tables, following the 1:1 relationships.  All of the columns in the resulting mashed-up table were then used in such a massive WHERE clause that it actually overran the buffer in DAX Studio before it captured all of the text.  Usually DAX Studio will cut off the text and tell you how many lines were remaining, but I think I overwhelmed it with my nonsense.  The resulting WHERE clause looked like it was filtering on every available value in all of the date columns in all of the joined date tables.

That is only a short piece of the never ending WHERE clause (yes it goes on and on my friend….).  Wow, I am not doing something right.

To try and limit the JOIN, I changed my measure to the following, this time specifying a column:

My4WkMeasure = Calculate([MyMeasure], Rolling_4_Date[DateID])

That’s more like it!  Total time: 48ms.  Again, I’m not exactly sure why it’s so much faster, but if you look at the queries, the WHERE clause is much simpler.

Ok, wrapping up, what did I learn here?  Always specify a column in your filter expression when using calculate?  Always test your measures against production data?  Always keep Dax Studio handy?

I guess so; these are all good things to know.  More importantly, I got to immediately put to work the mind-bending stuff that I am learning in the SQLBI.com course Optimizing DAX.  I am only halfway through the course and already am using way more of DAX Studio than I knew even existed before I started.  This is not a paid endorsement, btw; I ponied up out of my own pocket to get a bunch of the videos that these two mad scientists cooked up, and they have been worth every penny.

Using DAX Studio, I was able to performance test a measure by itself, quantify how bad a job I did with it, see where the slowness was coming from (yes, it’s the WHERE clause that doesn’t end…), make a change and immediately quantify how fast the measure was once I fixed it.

I will try to blog more perf scenarios as I come across them but for now, if you don’t know DAX Studio, and you are responsible for DAX/Power BI/SSAS Tabular performance, you need to learn it now.

And in case I didn’t firmly stick Lambchop into your ear with this post, enjoy:

Let it go, let it go, can’t import any more!

Ran into this today.  For once I was being patient and didn’t kill Power BI in its tracks and start over but maybe I should have?

I was doing a full import of my customer’s data set (24+ million rows) and, knowing it would take a while, I was doing my normal shiny-object hopping around the Power BI world on the internet.  Somewhere along the line, my import seemed to freeze and the count of rows imported stopped increasing.

Since I live out in the East Texas outback, you can bet it took a while to get the 21 million rows it had loaded before it hung, so I started through the stages of grief about this import.

  • Shock
    • Whaaa?
  • Denial
    • This can’t be happening, it’s not happening!
  • Anger
    • Quick route here too btw, dang it Microsoft!
  • Bargaining
    • Ok, let me try killing off all these other processes.  Here computer, please, take more memory and CPU and make it work again, please!
  • Depression
    • This sucks man, I have lost so much time.  Why does this ALWAYS (I have a teenager, sorry) happen to me!  Pout, whine, complain.
  • Testing
    • Task Manager says that it’s still pulling down data; it’s still churning away with the CPU.  Oh well, time to give up and hit cancel.  Wait, I can’t, it just made a noise like it’s got a window open somewhere (cue sinking feeling…).  Let me try ALT-TAB and see if there is a hidden notification window or something… oh rats.
  • Acceptance
    • Time to kill it and start over.

I am not exactly sure how I managed to fill up a drive, but it was suspicious that Carbonite was working away in the background during all of this.  It wouldn’t surprise me if Carbonite locked the file that Power BI was trying to update; I’m not sure.

My point is, always check for those hidden windows.  That could have saved me a big chunk of time.

 

 

Implementing slicer driven currency calculations

My customer recently wanted to add international currency support to their Power BI reports.  He gave me an Excel file that had conversion rates for the current year and asked that users be allowed to choose which currency format they wanted their data displayed in.

There is no easy ‘out of the box’ way to implement this.  The slicer needs to slice the result of a measure, which of course needs to be written generically.  There was a great video on the Curbal YouTube channel that walked through a method of using slicers to change the format of currency values.  I originally used this approach but ran into problems displaying the data in a matrix visualization.  Since we were formatting the data as strings, the data would not sort properly in a matrix, because the values were being sorted as strings instead of being treated as numbers.

Therefore, I went with the following approach that combined her very clever way of using slicers and helper tables with keeping the values as numbers so that we could sort the data properly.

I imported the Excel file as a standalone table with no relationships to any of the existing tables already in the model.  It exists as a simple lookup table.

Notice that since this is a US company, the raw data is stored in US Dollars so no conversion is needed for USD.

The first step is to make a slicer based on 'Exchange Rates'[Currency Type].

Any measure that would give a result in currency has the following DAX at the beginning:

RTL SALES TY =
VAR cur =
    IF (
        HASONEVALUE ( 'Exchange Rates'[Currency Type] ),
        VALUES ( 'Exchange Rates'[Currency Type] ),
        "USD"
    )
VAR res = -- Implement measure here
RETURN
    IF ( cur = "USD", res, res * VALUES ( 'Exchange Rates'[Exchange Rate] ) )

What happens is that when the measure is calculated, we first check whether the Exchange Rates table is being filtered down to one row.  If so, we store that row’s currency type in the variable cur; otherwise we store the default value of “USD”.  We implement our measure and store its value in a variable called res.  The final line of the measure checks the value of cur, and if a currency type other than “USD” has been selected, we multiply the result of the measure by the exchange rate.
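For concreteness, here is what the pattern looks like with a base measure filled in.  The Sales table and Amount column are hypothetical names for illustration; substitute your own fact table:

```
Total Sales Converted =
VAR cur =
    IF (
        HASONEVALUE ( 'Exchange Rates'[Currency Type] ),
        VALUES ( 'Exchange Rates'[Currency Type] ),
        "USD"
    )
VAR res = SUM ( Sales[Amount] )    -- hypothetical base measure
RETURN
    IF ( cur = "USD", res, res * VALUES ( 'Exchange Rates'[Exchange Rate] ) )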

The drawback of this approach is that we lost the ability to have a currency symbol associated with our measure results.  Since we don’t know what currency the user is going to select, we can’t type the measure as a specific currency type.  The customer wanted to ensure that users would be able to easily see what type of currency was being displayed so I created a simple measure that would show what currency type was selected.

Selected Currency =
SELECTEDVALUE ( 'Exchange Rates'[Currency Type], "Multiple Selected" )

SELECTEDVALUE is very similar to HASONEVALUE.  If a slicer in a visual limits the Exchange Rates table to just one row, the measure displays that row’s Currency Type.  If not, it displays ‘Multiple Selected’.

The following animated GIF is a quick demo of the approach with Worldwide Importers data.