Heatmap profiler slash code coverage tool

2012-08-27

I've been rewriting ZeParser to make it faster and cleaner. The work has progressed quite nicely and I've reached the point where I need to grasp for micro-optimizations. Stuff you'll only notice on huge files. After using the JIT Inspector firefox extension, I wanted more profiling. The built-in profiler of Chrome helped, but only to some extend. The built-in profiler of firebug, once so magnificent, is now utter crap (seriously, it only shows me like 10 results, what's up with that?). After searching a bit for a visual JS profiler, I couldn't really find any that was easy to use. In fact, most of those profilers required IE in one way or the other. Odd. So I built one myself.

Edit 5 Jan 2014: Updated links and instructions (only) in this post to reflect v2
Blog post for v2: here


After about two day of frantically hacking the result is a completely client-side, independent real-time JavaScript heatmap profiler. For JS, in JS. Runs on most browsers (IE9 gives me some unrelated crap I couldn't be arsed to fix). You can benchmark code from within the profiler or you can open another tab, run the code there, and watch the heatmap from a separate tab. This uses local storage. I've also made it so that you can quite easily embed this into an existing project so you can profile that. Oh the profiler also has the ability to show you the code coverage, since that information was available as well (whether you want to know the mostly executed code or the code that wasn't executed, is pretty much the same metric).

Check out the live demo here. For some example integrations, checkout this or this page. Then open the UI in another tab and press the "Listen" tab and go.

The project is open sources on github. You can read more information about it on the readme there.

The remainder of this blog post is about some of the rewrite rules that it uses to do its magic.

The profiler only works with tracking the number of times something is executed, opposed to the time that it takes. Time can be added, of course, but is currently not part of Heatfiler. This tracking is done through "pings". A ping is just a function being called whenever I want to count something. I add these calls to the start of any statement and to all the expression parts. The statements are pretty simple to do, but the expressions are a bit more work. Especially since I also wanted to see how the code branches with &&, ||, and the ternary operator ..?..:... I was looking to eliminate as many comparisons as possible in my parser.

So let's start with statements. Those are quite easy. You can prefix any statement with another statement and as long as the first statement doesn't change anything, you won't notice any change. Don't forget to wrap it in a block though, because otherwise certain statements like if will only execute the first statement and that will break your code :)

Code:
if (foo) bar;

{ ping(n); if (foo) { ping(n); bar; } }

As you can see this will incur a bit of overhead. But don't worry about that, it's only visual. The blocks do nothing else than to group a set of statements. There is no other side effect, so it's perfectly safe to use them and even wrap them a few times. A block in a block in a block, is perfectly fine in JS, and should incur no overhead whatsoever. We will skip ExpressionStatements (see below), but ping for any other statement. Note that function declarations are not statements.

Functions need to be handled slightly different. Prefixing them with a ping will not record their execution. However, you can safely assume that a function will have an opening and closing bracket. To know how many times a function is invoked, simply add a ping immediately after the opening bracket of a function.

In my code, I'm omitting the brackets if I detect a bracket on the left side of the statement I wanted to prefix. This makes the result slightly cleaner but more importantly, it prevents certain issues that arise from ASI's (missing semi's).

Expressions are a different beast. Their implementation was two-fold for me. First I did just an expression as a whole. Then I made the distinction to cut expressions up between logic operators. For expressions we use parenthesis like we used brackets for statements. The parenthesis will prevent an expression from being interpreted as two individual values in a comma delimited list (like a function call or an array literal).

At the start of a statement we will not add a parenthesis though, since that might cause problems with ASI. Instead, we just prefix the ping followed by a comma (to keep it a single expression statement), like this: ping(), .... This is perfectly safe as JS will execute both (any..) values in a comma expression.

For any other expression part, the expression is wrapped between like this: (ping(), ...). The only other danger with parenthesis here is for certain identifiers that should return a so called "Reference". This is for instance the case for operators like `delete` and labels. So you have to watch out for those.

So I was cutting up my expressions in parts that were actually executed, cutting at the conditional paths. This was fine until I encountered a case where it didn't work so great:

Code:
x && y
(ping(), x) && (ping(), y)

x ? y : z
(ping(), x) ? (ping(), y) : (ping(), z)

x = y || z
(ping(), x = y) || (ping(), z)

Can you see why that last one is not the same? That's right, it will now always assign y, rather than y || z. So I fixed that by simply wrapping the remainder of an expression in parenthesis right after any assignment operator.

Code:
x = y || z
(ping(), x = (y) || (ping(), z))

a = x = y || z
(ping(), a = (x = (y) || (ping(), z)))

And now the assignment will flow as before, the parenthesis are pretty much ignored. Adding more parenthesis is not an issue, and won't cause a problem in this case. Note that I'm not adding more pings here, simply because it won't really serve a purpose. The extra pings would be called just as many times as the whole. And the fact that an assignment may not occur again in an expression if it was preceded by almost any other operator (only parens and comma), means we don't have to add extra pings. The parenthesis, to make sure the logic remains the same, is sufficient.

Like I said, take care with labels. In my setup I don't have to worry about delete arguments (they're a single expression that cannot be broken by logic operators) but the break and continue statements do not start as an expression (duh) so my algorithm would want to wrap the labels too. I had to add a rule that prevented this.

Code:
break foo;
{ping(); break foo;

// and not
{ping(); break (ping(), foo);

The final piece to the puzzle is knowing how to track the counters. I mean, if is just a keyword and very likely to occur multiple times in the same script. So how do you track them uniquely? Well, that's quite easy actually. Just use their token position. That'll be unique within the same script. So that's what I'm using.

A simple example script:

Code:
function foo(a,b){
return a>5 && a+b || a-b;
}
for (var i=0; i<20; ++i) foo(i,i*10);

becomes
Code:
function foo(a,b){FOO(0,0);
{ BAR(0,8,20); return (BAR(0,9,11), a>5) && (BAR(0,13,15), a+b) || (BAR(0,17,19), a-b); }
}
{ BAR(0,22,44); for (var i=(BAR(0,27,27), 0); (BAR(0,29,31), i<20); (BAR(0,33,34), ++i)) BAR(0,36,43), foo((BAR(0,38,38), i),(BAR(0,40,42), i*10)); }

So... yeah.

So go play with the demo if you haven't already!

I hope you enjoyed this and I hope the tool helps you :)