Beyond my general disclaimer, these are personal notes that may be out of date. Read generously.
When someone spins the wheel, which of the following would you consider to be an "Error"?
Most people will divide this list in the following way:
Valid:
Errors:
There's another way you might model this, however:
Valid:
Errors:
The idea here is that spin-wheel returns a SpinWheelResult instead of a Wedge, which could then have WheelLogicViolation as a return value, making WEAK_SPIN a proper value instead of an error.
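A minimal Java sketch of this second model, to make the idea concrete. Only the names spin-wheel, SpinWheelResult, Wedge, WheelLogicViolation and WEAK_SPIN come from these notes; the wedge values, the strength parameter and the threshold are assumptions made purely for illustration.

```java
import java.util.Optional;

/** Assumed wedge values; the notes do not list them. */
enum Wedge { CASH, BANKRUPT }

/** WEAK_SPIN comes from the notes; no other violations are assumed here. */
enum WheelLogicViolation { WEAK_SPIN }

/** Either a wedge or a violation; both are ordinary return values. */
final class SpinWheelResult {
    private final Wedge wedge;                    // null when a violation occurred
    private final WheelLogicViolation violation;  // null when a wedge was hit

    private SpinWheelResult(Wedge wedge, WheelLogicViolation violation) {
        this.wedge = wedge;
        this.violation = violation;
    }

    static SpinWheelResult landedOn(Wedge wedge) {
        return new SpinWheelResult(wedge, null);
    }

    static SpinWheelResult violated(WheelLogicViolation violation) {
        return new SpinWheelResult(null, violation);
    }

    Optional<Wedge> wedge() { return Optional.ofNullable(wedge); }

    Optional<WheelLogicViolation> violation() { return Optional.ofNullable(violation); }
}

final class Wheel {
    /** Assumed rule: a spin below this strength counts as too weak. */
    static final double MIN_STRENGTH = 0.2;

    static SpinWheelResult spinWheel(double strength) {
        if (strength < MIN_STRENGTH) {
            // A weak spin is reported as a value, not thrown as an error.
            return SpinWheelResult.violated(WheelLogicViolation.WEAK_SPIN);
        }
        // Deterministic stand-in for a real wheel, so the sketch stays testable.
        Wedge wedge = strength < 0.6 ? Wedge.CASH : Wedge.BANKRUPT;
        return SpinWheelResult.landedOn(wedge);
    }
}
```

With this shape, a caller has no exceptional path to handle: a weak spin is just one of the values spinWheel can return.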
The core idea here is that nothing is inherently an error. Rather, we choose to model things certain ways because they modularize our concepts well - we like to think of a Wedge as the output of spinning the wheel because it results in a conceptually concise "happy-path", but this is purely an artifact of what is an ergonomic way of thinking about things.
There is, however, a more formal way of thinking about what an error is.
A function f: Domain -> Codomain is simply a mapping from elements of a Domain to elements of a Codomain.
A total function is simply one that is defined for all elements of the Domain, in contrast to a partial function, which is undefined for certain inputs.
The division function /: (R,R) -> R is undefined for the denominator value of 0, and is therefore a partial function.
If we had instead defined it as /: (R,R\{0}) -> R, it would be a total function.
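The domain-restricted version can be sketched in Java by giving the denominator its own type. This is only an illustration: NonZero and TotalDivision are hypothetical names, and double stands in for the reals.

```java
import java.util.Optional;

/** A double that is guaranteed not to be zero (hypothetical helper type). */
final class NonZero {
    final double value;

    private NonZero(double value) {
        this.value = value;
    }

    /** The only way to obtain a NonZero; the zero check happens here, once. */
    static Optional<NonZero> of(double value) {
        return value == 0.0 ? Optional.empty() : Optional.of(new NonZero(value));
    }
}

final class TotalDivision {
    /** Models /: (R, R\{0}) -> R; a zero denominator cannot be passed in. */
    static double divide(double numerator, NonZero denominator) {
        return numerator / denominator.value;
    }
}
```

The partiality has not disappeared; it has moved to NonZero.of, where the caller decides what a zero denominator means.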
What are some other examples?
As the division example shows, Totality is mostly a modeling property that depends on how you choose to describe your function - in a very real sense, both of those functions are the division that we're all familiar with.
The question of whether or not the function is "total" is simply a matter of how you want to describe its possible inputs and outputs.
The fact that partial functions exhibit the same kind of "modeling arbitrariness" as errors is not a coincidence. Indeed, everything that we think of as an Error can be formalized as a cause of Partiality - put another way, it's Errors that make total functions partial.
The division function that accepts 0 has a DivideByZero error. The function that asks a database to look up a particular value has a NetworkTimeout error when the internet connection gets cut.
The "pure" set of concerns revolves around the input values that are passed into your functions, of which there are typically two sources of error:
In today's appendix, we'll discuss Dependent Typing, which is a type system powerful enough to eliminate the divide-by-0 problem, which raises an interesting question: Is it possible to eliminate the second class of errors?
It's actually a consequence of the Halting Problem that type systems cannot be powerful enough to do that.
The second source of Partiality is of fundamental importance because computing must run on real-world machines. It has two main sources:
[...] (OutOfMemory) that can cause potentially any function to fail.
Let's say you've got a database, and you have to query a table to fetch a particular record by its ID. What are the possible errors? How would you classify each of them by the type of Partiality they introduce?
Before concluding, let's do a quick survey of the different kinds of error representations that are commonly used, and discuss the kinds of Partiality they tend to represent.
null, false, -1, etc.
val, err := fn() as in golang
throw Exception or golang's panic()
Try, Haskell's Maybe, etc.
When you run through a bunch of different use cases, you start noticing that there doesn't seem to be much rhyme or reason to why people use a particular error representation for a particular use case - inconsistency and personal preference or familiarity dictate many of these choices.
As we established previously, it's not possible for a type system to be strong enough to eliminate the possibility of Pure Partiality errors. It's nonetheless interesting to explore more powerful type systems, as they can create tighter bounds on the domains of your functions, which can reduce the number of Pure Partiality error cases you must handle.
safe_div : (x : Int) -> (y : Int) -> {auto p : so (y /= 0)} -> Int
safe_div x y = div x y
For every function you write, anything for which an output value isn't defined for a particular input, or an IO error causing incorrect termination, is a cause of partiality and should be modeled as an error.
The essence of robust software is not to eliminate errors, since errors are simply branches of other behavior that arise from how we have chosen to model our functions.
Rather, it is to exhaustively handle those branches in an intentional fashion.
Error handling mechanisms exist in order to help you convert Partial functions into "Total" functions.
Let's consider:
public static Rational divide(Rational numerator, Rational denominator) throws ArithmeticException
This signature is essentially
/: (Q,Q) -> Rational || ArithmeticException
which is Total!
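A small runnable sketch of that reading. It uses int instead of the Rational type from the signature above (an assumption made only so the example needs no extra classes), and relies on the fact that Java integer division throws ArithmeticException when the divisor is 0. CheckedDivision and describe are hypothetical names.

```java
final class CheckedDivision {
    /** Codomain is effectively "int or ArithmeticException". */
    static int divide(int numerator, int denominator) throws ArithmeticException {
        return numerator / denominator;
    }

    /** Handles both branches of that codomain, so every input yields a result. */
    static String describe(int numerator, int denominator) {
        try {
            return "value: " + divide(numerator, denominator);
        } catch (ArithmeticException e) {
            return "undefined: " + e.getMessage();
        }
    }
}
```

Every pair of inputs lands in one of the two branches, which is what makes the function total once the exception is counted as part of the output.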
"I call it my billion-dollar mistake. It was the invention of the null reference in 1965." (Tony Hoare)
This is the act of returning a value that acts as the "sentinel" and represents an error (such as null or -1).
[...] (-1 means different things depending on where the error occurred), which means that your handling must be maximally localized - invoke the function and handle the error immediately before passing the return value anywhere else (ideally before even binding it to a variable).
$username = $_GET['username'] ?? 'not passed';
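The same "handle it where it is produced" advice, sketched in Java. String.indexOf really does return -1 when the character is absent; SentinelBoundary and domainOf are hypothetical names, and wrapping the outcome in Optional is just one way to keep the sentinel from travelling further.

```java
import java.util.Optional;

final class SentinelBoundary {
    /** Returns the part of an address after '@', or empty if there is no '@'. */
    static Optional<String> domainOf(String address) {
        // String.indexOf returns the sentinel -1 when the character is absent.
        int at = address.indexOf('@');
        if (at == -1) {
            // The sentinel is interpreted right here and never leaves this method.
            return Optional.empty();
        }
        return Optional.of(address.substring(at + 1));
    }
}
```

Callers of domainOf never see -1, so they cannot mistake it for a valid index.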
The second mechanism we'll consider is the Exception. As separate values with explicit control-flow implications, exceptions were considered an improvement over sentinel values since they made it very difficult for people to silently and accidentally ignore an error.
[...] (MySystemException())
The last model we'll consider is the error monad. These are the Option, Maybe, Try, and Result types of the world.
They are containers, which means that you'll have to crack them open (and handle errors) in order to get at the real value, but it also means that the errors are all well-contextualized, making it possible to pass them around as values safely.
[...] if/else statements - methods like .isPresent() or .get() should generally be avoided. Methods like .filter() or .getOrElse() are generally preferable.
Use .map() in order to defer accessing the actual value, and use .flatMap() in order to cleanly chain with other partial functions.
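A short sketch of that chaining style with Java's Optional. The config lookup, the key name "timeoutSeconds" and the 30-second default are all hypothetical; note that Java spells getOrElse as orElse.

```java
import java.util.Map;
import java.util.Optional;

final class OptionalChaining {
    /** Partial function 1: a key may be missing (hypothetical config lookup). */
    static Optional<String> lookup(Map<String, String> config, String key) {
        return Optional.ofNullable(config.get(key));
    }

    /** Partial function 2: not every string is an integer. */
    static Optional<Integer> parseInt(String raw) {
        try {
            return Optional.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    /** Chains both and supplies a default only at the very end. */
    static int timeoutMillis(Map<String, String> config) {
        return lookup(config, "timeoutSeconds")
                .flatMap(OptionalChaining::parseInt)   // chain another partial function
                .filter(seconds -> seconds > 0)        // reject values outside the domain
                .map(seconds -> seconds * 1000)        // transform without unwrapping
                .orElse(30_000);                       // assumed default, applied once
    }
}
```

No isPresent() or get() appears anywhere: the empty case flows through the chain untouched until orElse resolves it.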
Errors are simply other branches to be dealt with, which means error-handling code is still code, which means that Modularity matters.
In particular, three guiding heuristics are worth considering for error-handling code:
One of the key questions you should be asking yourself when it comes to handling an error is simply: "Who has enough information to actually handle this error properly?"
Let's consider a case:
If you have a platform that abstracts away integrations with many different partners, how would you use the Authority heuristic to decide which errors the platform should encapsulate and which it should bubble up?
An interesting consequence of the Authority Heuristic is in how to deal with Physical Layer failures.
The reality is that these are inevitable and unhandleable - your code cannot remediate failures in the physical machine it runs on - and as such, they are not worth modeling. Process Teardown should be how we proceed, and recovery must be delegated to the meta-system (such as a daemon or a human).
This then means that all code should be written as if it could fail and terminate at any moment, and that the resulting state should be such that the meta-system's recovery process will result in correct application state.
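One common way to write code that tolerates termination at any moment is to build new state on the side and publish it with a single atomic step. The sketch below is an assumption about how this could look for a file, not something these notes prescribe: CrashSafeWriter and the ".tmp" suffix are hypothetical, and the sketch does not force data to disk, which a real implementation would need for durability.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CrashSafeWriter {
    /**
     * Writes to a sibling temp file, then renames it over the target.
     * If the process dies before the move, the old target is untouched.
     */
    static void writeAtomically(Path target, String contents) {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
            // Throws AtomicMoveNotSupportedException (an IOException) if the
            // file system cannot perform the move atomically.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A crash can leave a stray temp file behind, but never a half-written target, so a restart by the meta-system sees either the old state or the new one.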
Another major modularity principle is that of Volatility Risk.
Code to handle errors is the same as code to handle the happy path - if you have components that are vulnerable to change, then you'll want to decompose your code along those boundaries, so that changes do not risk contaminating the rest of your code.
A great example of this is database IO.
The vast majority of server-side logic allows database-specific exceptions or errors to leak out of the persistence layer adapter. This means that anyone in any other part of the codebase could catch (SQLException e) and write logic that is now dependent on the specific database you are using.
If you were to ever try and change the database you were using, you'd now have invisible dependencies all over the codebase that would be very difficult to tease out.
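A sketch of keeping that dependency inside the adapter. SQLException is the real java.sql type; PersistenceException, UserStore, SqlQuery and SqlUserStore are hypothetical names, and SqlQuery stands in for a real JDBC call so the example runs without a database.

```java
import java.sql.SQLException;
import java.util.Optional;

/** Storage-agnostic failure; the only persistence error callers ever see. */
class PersistenceException extends RuntimeException {
    PersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** What the rest of the codebase depends on; no database types appear here. */
interface UserStore {
    Optional<String> findName(long id);
}

/** Stand-in for a real JDBC query (assumption for this sketch). */
interface SqlQuery {
    String run(long id) throws SQLException;
}

final class SqlUserStore implements UserStore {
    private final SqlQuery query;

    SqlUserStore(SqlQuery query) {
        this.query = query;
    }

    @Override
    public Optional<String> findName(long id) {
        try {
            return Optional.ofNullable(query.run(id));
        } catch (SQLException e) {
            // Translate at the boundary so SQLException never leaks out of the adapter.
            throw new PersistenceException("user lookup failed for id " + id, e);
        }
    }
}
```

Swapping the database then means writing another UserStore implementation; no caller has a catch (SQLException e) to hunt down.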
Arbitrariness allows us to model our functions however we want, but among these many options we should prefer the ones that eliminate branches of behavior, since fewer branches mean less complexity.
John Ousterhout has described this idea as "define errors out of existence".
Consider his example, the unset function that removes a variable.
What should this function do if the input variable already doesn't exist?
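One possible answer, sketched in Java: define unset by its postcondition ("the variable does not exist afterwards"), so a missing variable is not a special case at all. The Variables class is hypothetical and only illustrates the idea.

```java
import java.util.HashMap;
import java.util.Map;

final class Variables {
    private final Map<String, Object> values = new HashMap<>();

    void set(String name, Object value) {
        values.put(name, value);
    }

    boolean isSet(String name) {
        return values.containsKey(name);
    }

    /**
     * Postcondition: the variable does not exist afterwards.
     * An absent variable already satisfies it, so there is no error branch.
     */
    void unset(String name) {
        values.remove(name);   // Map.remove does nothing for a missing key
    }
}
```

Callers can unset the same name twice, or a name that was never set, without writing any handling code.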
I want to leave you with two heuristics to regularly use:
If you consistently follow these ideas and continually seek ways to improve the expressiveness and clarity of your code, the rest will follow.