pvdz.ee - Finally no more

Preval is a library I've been working on for a few years that attempts to eliminate and simplify code through static analysis. A big hurdle for me was the finally keyword. In fact, I've ignored try for a long time, too long really, and it ended up as tech debt that was much harder to tackle after all the other bits are in place. Generators and async functions are are also in that queue but that's not the topic of this post.

You can find Preval on github: github.com/pvdz/preval
There is also a REPL (consider it wip though): pvdz.github.io/preval/web/

Ref Tracking

Last year I was working on Preval for a while when I realized that ref tracking was broken. I don't remember how exactly but there were some cases I could come up with that clearly proved how broken it was, so I had to rewrite it.

Ref tracking, at least in Preval, is basically a form of code flow analysis where I try to figure out which reads and writes from/to variables can "see" other reads and writes of that variable. In this context, a read can "see" a write when it may observe the value being written to in at least one potential logic branch.

Code:

let x = 1;
log(x); // Can see x=1
if ($) {
  log(x); // Can see x=1
  x = 2;
  log(x); // Can no longer see x=1, but can see x=2
}
log(x); // Can potentially see x=1 and x=2

In the above case, there are two "writes" (the let def and the assignment) and a bunch of "reads". The decl can be "seen" by all-but-one read, while the assignment can be seen by two reads. That's because after you assign to a variable, the reads that are guaranteed to follow are guaranteed to see the value of that last write. After a logic branch, the read can see either the last writes inside each branch, or the value was it was before the branch if it doesn't overwrite in all branches. That's shown above.

In loops this is more awkward because you have to special case the loop re-entry. I won't go into it here but it does require some work.

Finally tracking

In try/catch/finally this is even worse for multiple reasons:

- Code can enter the catch/finally blocks from any point that can throw
    - Note that we can prove that certain statements can not throw at runtime, like a break for example
- Code always goes to the finally block, regardless
- The three blocks are not nested but side by side, which affects scoping and is the only exception on JS when it comes to code flow
    - Consider that if/else never enter both blocks and loops have one block. Nesting is always preserved.
    - Labels are actually syntactically required to have proper nesting

So a finally block can be entered from virtually any point in the try block (recursively!) and any read or write inside the finally can therefor see every read or write inside the try AND catch blocks as well as the value before the try block.

Okay, maybe not every, but trust me when I say this doesn't make it easier.

Ref tracking v2.1.2.3.final.2.bugged

I had to fix ref tracking. Preval would be unreasonably broken if I didn't. But it's a daunting task. I had postponed it a few times, even had a few month hiatus after burning out on failing to solve the problem (started playing chess instead). And then I came back to it.

I got inspired again and figured out a way forward. By tracking and caching and magic I found a way forward that would allow me to have proper ref tracking across try/catch/finally!

And when I had pretty much implemented all of it ... I realized, why not just drop it. Hmmmmmmmm. Dang. Ok.

Finally finally

To transform the finally we first separate them out. All things considered, a try {} catch {} finally is really identical to try { try {} catch {} } finally {}. You can't detect the difference in semantics, other than different stack traces of course.

This simplifies our transform life a little bit because when eliminating the finally now, we only need to worry about the try block, and no longer care about edge cases around the optional catch block.

On top of that, it means we eliminate one special case for try in Preval, namely the case where it has both blocks. After separating them, we can assert that all try blocks only have one or the other but never both.

Code:

try { } catch { } finally {}

// ->

try {
  try {} catch {}
} finally {}

This simplifies logic and edge cases. And the continuation after a catch is always going to enter the finally block regardless. And when a catch is outer-trapped by a finally, that's not a concern for the catch or even the try inside of it.

This transform is simple, doesn't add meaningful overhead, and it simplifies the rules Preval has to deal with. It is a perfect example of how I'd ideally simplify all language features. But ... can we push it further?

Finally gone

So now that we're down to two cases of occurrences of try (it now either has a catch or a finally but never both), what would it take to eliminate, to transform, the finally keyword out of the input code? Of course, in such a way that the semantics are kept.

Doing this would mean that all try occurrences only ever have a catch to deal with! That's a big deal so we're allowing some room for overhead here.

Okay okay. So what does a finally really do?

- Its block is guaranteed to be executed after the try no matter how they complete
- It's able to change the "completion" of the try block as long as the continuation point is outside of the try
- This means doing a try { return } finally { A: break A; } does not change anything, while a A: try { return; } finally { break A; } prevents the return entirely (yes, really). Key here: only when the continuation is outside of the try.
- The try and finally blocks have a local scope which is kind of relevant in case they introduce new vars (but since all vars are uniquely (re)named in Preval that's not a concern for us)
- Implicit crashes are also trapped by the finally so that's a concern to keep in mind

In code, this is kind of what that looks like:

Code:

log(1);
try { log(2); }
finally { log(3); }
log(4);

//->

log(1);
log(2);
log(3); 
log(4);

Except, of course, it's trickier than that. What if the log(2) call crashes? The finally would still need to (try to) execute. So the above is incorrect:

Code:

log(1);
try { log(2); }
finally { log(3); }
log(4);

//->

log(1);
try { log(2); }
catch { }
log(3); 
log(4);

Better? Well, yes, and no. Because if the log(2) crashes then it would not continue after the try... Hrm.

Code:

log(1);
try { log(2); }
finally { log(3); }
log(4);

//->

log(1);
let crashed = false;
try { log(2); }
catch { crashed = true; }
log(3);
if (crashed) throw new Error;
log(4);

Now better? Yes, but there's on more thing left. The error value that was thrown should be preserved. After all, that's an important semantic for debugging if nothing else.

Code:

log(1);
try { log(2); }
finally { log(3); }
log(4);

//->

log(1);
let crashed = false;
let err = undefined;
try { log(2); }
catch (e) { err = e; crashed = true; }
log(3);
if (crashed) throw err;
log(4);

That's it! Yes, the origin of the throw is mangled but keep in mind that we transform the input source quite drastically so that is not as much of a concern to me.

This is how we can rewrite the finally block for a try that doesn't explicitly complete (return etc).

Trying completions

Now what about abrupt completions in the try? Like a break or a return? We've already covered the case where it happens in the finally so we focus here on the try block.

They're basically not a big deal either way but some care does need to be taken here in the transform.

When inside the try We have to find all occurrences of break, continue, return, and throw and add some tracking logic to ensure we can override or retain the continuation.

In its basic form that looks like this:

Code:

function f(){
  try {
    if (a) return 1;
    if (b) throw 2;
  } finally {
    return 3;
  }
}
log(f());

//->

let f = function () {
  debugger;
  {
    let $implicitThrow = false;
    let $finalStep = false;
    let $finalStep$1 = false;
    let $finalCatchArg = undefined;
    let $finalArg = undefined;
    let $finalArg$1 = undefined;
    $finally: {
      try {
        if (a) {
          $finalStep = true;
          $finalArg = 1;
          break $finally;
        }
        if (b) {
          $finalStep$1 = true;
          $finalArg$1 = 2;
          break $finally;
        }
      } catch ($finalImplicit) {
        $implicitThrow = true;
        $finalCatchArg = $finalImplicit;
      }
    }
    {
      return 3;
    }
    if ($implicitThrow) throw $finalCatchArg;
    else if ($finalStep) return $finalArg;
    else if ($finalStep$1) throw $finalArg$1;
    else {
    }
  }
};
log(f());

This will work well. The above transform is what Preval does (it will end up eliminating the redundant label). It remembers the early completions and adds them at the end. This logic is harder to analyze than syntactical keywords but that's an trade-of versus the complexity that finally logic would add.

As you can see above, the `return 2` makes all the other completions dead code and in a subsequent step it'll all be eliminated. In fact, in its current state, this is the result after running Preval on it:

Code:

log(3);

There's some semantics around implicit global exceptions but that's topic for another time.

Worth it

So this sort of transform is absolutely possible. Now the question remains: is that overhead worth the elimination?

Applying this transform means I can completely remove the keyword from my normalized subset. And all of its strange rules. On top of that it melted many complications for the ref tracking alone. There were similar simplifications in other parts of Preval.

As such I would say yes.

Furthermore, many parts of the transform are eliminated later as it either becomes dead code or simple code to inline. A true usage of finally probably gets out a little worse in terms of the analyze step. But on a whole I think Preval is better off without it.