Saturday, June 27, 2020

Checked exceptions break composition

A.K.A. Always Throw Runtime Exceptions or their subclasses

A typical Java or C++ function can come with an exception specification: for example, a method can declare that it throws exceptions of a particular type (e.g. IOException, std::bad_alloc etc.), and clients need to handle that exception with a try-catch block. This seems good at the outset, until we think through what it does to the type of the function.

A typical function in a happy-go-lucky world either succeeds or fails with an exception because of something beyond its control. If it succeeds, it returns a value of the declared return type (let's call it SuccessValueType). If it fails with an exception (e.g. a file read error or a memory allocation error), it throws the exception and the error-handling parts of the code run. In type terms, the return type of the function is Either<SuccessValueType, RuntimeExceptionType> (where RuntimeExceptionType is an implicit return type of the function). If all functions agree that RuntimeExceptionType is the implicit secondary return type, functions and try-catch blocks compose beautifully. This is because every function call site becomes an implicit early return point with a valid return value from the function. As a corollary, every try-catch block wrapping the function makes little to no assumptions about the kinds of exceptions it's likely to receive, and that builds in flexibility for code evolution.

Here's an example:
Function 1 => calls => Function 2 followed by Function 3; both Function 2 and 3 can only throw RuntimeExceptions
If either of these functions throws, the RuntimeException is propagated unchanged as an "early return" from Function 1. You can stack as many layers of nesting as you like, and the return types and early-return behavior remain compatible (because all the functions agree that RuntimeExceptionType is an implicit return type).
The application then adds error-handling code close to the top level of the processing hierarchy and presents the error to the user (as a form of recovery), retries, or notifies an engineer to take a look. If we need to add context to the exception, a try-catch block can be introduced at any level to attach context information and rethrow the RuntimeException. Introducing an intermediate try-catch is a purely local change that composes well with try-catch blocks further up the stack (removing one composes equally well). Adding new libraries or call paths to the code also remains a purely local operation and does not affect the type hierarchy or the error-handling try-catch structure.
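As a rough illustration of the example above, here is a minimal Java sketch. The class and method names (UncheckedComposition, function1, function2, function3, runTopLevel) and the failure conditions are made up for illustration; only the shape matters: no throws clauses, an optional context-adding layer, and one handler near the top.

```java
final class UncheckedComposition {
    // Hypothetical leaf functions: both fail only with RuntimeException subclasses.
    static int function2(int x) {
        if (x < 0) {
            throw new IllegalArgumentException("negative input: " + x);
        }
        return x * 2;
    }

    static int function3(int x) {
        if (x > 100) {
            throw new IllegalStateException("value too large: " + x);
        }
        return x + 1;
    }

    // No throws clause and no try-catch: a failure in either call
    // propagates unchanged, acting as an implicit early return.
    static int function1(int x) {
        return function3(function2(x));
    }

    // Optional intermediate layer: attach context and rethrow.
    // Adding or removing this layer does not change any signature.
    static int function1WithContext(int x) {
        try {
            return function1(x);
        } catch (RuntimeException e) {
            throw new RuntimeException("function1 failed for input " + x, e);
        }
    }

    // Top-level handler: the one place that decides how to present the error.
    static String runTopLevel(int x) {
        try {
            return "ok: " + function1WithContext(x);
        } catch (RuntimeException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTopLevel(5));   // prints "ok: 11"
        System.out.println(runTopLevel(-1));  // prints "error: function1 failed for input -1"
    }
}
```

Deleting function1WithContext and calling function1 directly from runTopLevel would still compile and still funnel every failure to the same handler; only the message would lose its added context.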

Contrast this with what happens when a checked exception is introduced. The function type changes from bi-valent to tri-valent: Either<SuccessValueType, RuntimeExceptionType, CheckedExceptionType>. Note that avoiding the RuntimeExceptionType is not possible (else you'll have code littered with redundant bad_alloc, io_exceptions and the like that are meaningless). With a tri-valent return type from a function, we have two options:

1. Convert the function back into the bi-valent return type by introducing a try-catch block, catching the checked exception and rethrowing as a RuntimeException.
2. Propagate the checked exception and ask our clients to update their code.

(1) is of course the reasonable thing to do. It's a local operation, client code doesn't have to change, and we're back to dealing with only a single type of failure (either the function succeeds or the function fails with a RuntimeException).
(2) is a world of pain. If we're in this world, every new introduction of a checked exception means that significant chunks of the program have to change to include the new checked exception type.
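Option (1) might look like the following in Java. This is only a sketch: the class WrapChecked and its methods are hypothetical, and failingReader exists purely to simulate an I/O failure. IOException and UncheckedIOException are the standard java.io classes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

final class WrapChecked {
    // Option (1): catch the checked IOException locally and rethrow it
    // as an unchecked exception, keeping the original as the cause.
    static String firstLine(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine();
            return line == null ? "" : line;
        } catch (IOException e) {
            throw new UncheckedIOException("could not read first line", e);
        }
    }

    // Callers need neither a throws clause nor a try-catch block.
    static int firstLineLength(Reader source) {
        return firstLine(source).length();
    }

    // Hypothetical helper that simulates a failing data source.
    static Reader failingReader() {
        return new Reader() {
            @Override
            public int read(char[] buffer, int offset, int length) throws IOException {
                throw new IOException("simulated read failure");
            }

            @Override
            public void close() {
                // Nothing to release in this simulated source.
            }
        };
    }
}
```

The conversion happens in exactly one place, so firstLineLength and everything above it keep the same signature whether or not the underlying read can fail.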

Going back to our original example:
If Function 1 => calls => Function 2 followed by Function 3 and both of them throw checked exceptions of different types, the return type of Function 1 becomes Either<SuccessValueType, RuntimeExceptionType, F2CheckedExceptionType, F3CheckedExceptionType> (essentially, the union of all the checked exceptions shows up in the return type signature). As we keep adding more nested functions, this type list keeps expanding.

In practical terms, this means that the developer adds "throws F2CheckedExceptionType, F3CheckedExceptionType, ..." to each of the caller functions in order to get them to compose. All the try-catch blocks similarly bloat to handle all the possible failure cases. Beyond small-sized codebases, this is completely infeasible because these signature changes and the try-catch handlers keep propagating out throughout the codebase. This hurts dev velocity.
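To make the signature growth concrete, here is a small Java sketch of the same call chain using checked exceptions. All names (CheckedPropagation, F2CheckedException, F3CheckedException, caller, topLevel) are hypothetical and the failure conditions are arbitrary.

```java
final class CheckedPropagation {
    // Hypothetical checked exceptions, one per leaf function.
    static class F2CheckedException extends Exception {
        F2CheckedException(String message) { super(message); }
    }

    static class F3CheckedException extends Exception {
        F3CheckedException(String message) { super(message); }
    }

    static int function2(int x) throws F2CheckedException {
        if (x < 0) {
            throw new F2CheckedException("negative input");
        }
        return x * 2;
    }

    static int function3(int x) throws F3CheckedException {
        if (x > 100) {
            throw new F3CheckedException("value too large");
        }
        return x + 1;
    }

    // The signature must list the union of everything the callees declare.
    static int function1(int x) throws F2CheckedException, F3CheckedException {
        return function3(function2(x));
    }

    // Every additional layer repeats the same list, even though it does
    // nothing with either exception.
    static int caller(int x) throws F2CheckedException, F3CheckedException {
        return function1(x) + 1;
    }

    // The handler grows one catch clause per checked type.
    static String topLevel(int x) {
        try {
            return "ok: " + caller(x);
        } catch (F2CheckedException e) {
            return "f2 failed: " + e.getMessage();
        } catch (F3CheckedException e) {
            return "f3 failed: " + e.getMessage();
        }
    }
}
```

If a hypothetical function4 with its own checked exception were added to function1, the throws clauses of function1 and caller would both have to change, and topLevel would need another catch clause (or a broader one) before the code compiles again.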

From a recovery perspective, these checked exceptions are typically handled just one level up the call stack, at the lowest level of library code (to avoid the exception signature blowout), and a local resolution is applied: retry a few times and then fail. This is typically not optimal. For an out-of-disk-space error, for example, a batch processing application might prefer an immediate crash, while a streaming application might prefer to retry continuously. Without propagating the error all the way up to the application, this choice of recovery can't be made reasonably, and the only way out is to pass down configuration to control the behavior... a gargantuan mess.

In the RuntimeException-only world, the retry configuration stays at the top level, where things can be handled based on the execution environment.
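One way to picture this is a small retry helper that lives only at the top level. The sketch below is an assumption about how such a policy could be written, not a prescribed design: TopLevelRecovery, DiskFullException and runWithRetries are hypothetical names, and the attempt limit stands in for whatever policy the execution environment calls for.

```java
import java.util.function.Supplier;

final class TopLevelRecovery {
    // Hypothetical unchecked failure raised somewhere deep in library code.
    static final class DiskFullException extends RuntimeException {
        DiskFullException(String message) { super(message); }
    }

    // Runs the task up to maxAttempts times and rethrows the last failure
    // if every attempt fails. Lower layers know nothing about this policy.
    static <T> T runWithRetries(Supplier<T> task, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

A batch application could call runWithRetries(task, 1) so that the first failure surfaces immediately, while a streaming application could pass a larger limit (or wrap the call in its own loop). Neither choice requires touching the library code that throws.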

In summary, as a practical matter, professional software engineers should ensure that their functions only throw unchecked exceptions (RuntimeExceptions or similar). Checked exceptions are actively harmful to dev velocity in large codebases and should be avoided. Google avoids this tar pit by banning exceptions from C++ code (for historical reasons). LinkedIn and Pinterest actively use RuntimeException-based Java codebases, and you should encourage this too.