Sun, 30 Apr 2017
What do you mean ExceptT doesn't Compose?
Disclaimer: I work at Ambiata (our GitHub presence), probably the biggest Haskell shop in the southern hemisphere. Although I mention some of Ambiata's coding practices, in this blog post I am speaking for myself and not for Ambiata. However, the way I'm using ExceptT and handling exceptions in this post is something I learned from my colleagues at Ambiata.
At work, I've been spending some time tracking down exceptions in some of our Haskell code that have been bubbling up to the top level and killing a complex multi-threaded program. On Friday I posted a somewhat flippant comment to Google Plus:
Using exceptions for control flow is the root of many evils in software.
Lennart Kolmodin, who I remember from my very earliest days of using Haskell in 2008 and who I met for the first time at ICFP in Copenhagen in 2011, responded:
Yet what to do if you want composable code? Currently I have
type Rpc a = ExceptT RpcError IO a
which is terrible
But what do we mean by "composable"? I like the Wikipedia definition:
Composability is a system design principle that deals with the inter-relationships of components. A highly composable system provides recombinant components that can be selected and assembled in various combinations to satisfy specific user requirements.
The ensuing discussion, which also included Sean Leather, suggested that these two experienced Haskellers were not aware that with the help of some combinator functions, ExceptT composes very nicely and results in more readable and more reliable code.
At Ambiata, our coding guidelines strongly discourage the use of partial functions. Since the type signature of a function doesn't include information about the exceptions it might throw, the use of exceptions is strongly discouraged. When using library functions that may throw exceptions, we try to catch those exceptions as close as possible to their source and turn them into errors that are explicit in the type signatures of the code we write. Finally, we avoid using String to hold errors. Instead we construct data types to carry error messages and render functions to convert them to Text.
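As a minimal sketch of the last two guidelines, here is an error data type with a separate render function that produces Text. The names ConfigError, renderConfigError and validatePort are hypothetical and invented for illustration; they are not taken from the demo code or from Ambiata's code base. The sketch uses only base and the text package.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as T

-- Hypothetical error type for a config loader (an assumption for this sketch).
data ConfigError
  = ConfigMissingKey Text
  | ConfigBadPort Text Int
  deriving (Eq, Show)

-- Rendering is kept separate from printing, so the caller decides
-- where the resulting Text ends up.
renderConfigError :: ConfigError -> Text
renderConfigError ce =
  case ce of
    ConfigMissingKey key ->
      "Missing configuration key '" <> key <> "'."
    ConfigBadPort key port ->
      "Key '" <> key <> "' has invalid port " <> T.pack (show port) <> "."

-- A total function: the way it can fail is visible in the return type.
validatePort :: Text -> Int -> Either ConfigError Int
validatePort key port
  | port > 0 && port < 65536 = Right port
  | otherwise = Left (ConfigBadPort key port)

main :: IO ()
main =
  case validatePort "db.port" 70000 of
    Right port -> putStrLn ("Port OK: " <> show port)
    Left err -> T.putStrLn (renderConfigError err)
```

Because the error is an ordinary value rather than a String, callers can pattern match on it, wrap it in a larger error type, or render it differently for logs and for users.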
In order to properly demonstrate the ideas, I've written some demo code and made it available in this GitHub repo. It compiles and even runs (providing you give it the required number of command line arguments) and hopefully does a good job demonstrating how the bits fit together.
So let's look at the naive version of a program that doesn't do any exception handling at all.
    import Data.ByteString.Char8 (readFile, writeFile)
    import Naive.Cat (Cat, parseCat)
    import Naive.Db (Result, processWithDb, renderResult, withDatabaseConnection)
    import Naive.Dog (Dog, parseDog)
    import Prelude hiding (readFile, writeFile)
    import System.Environment (getArgs)
    import System.Exit (exitFailure)

    main :: IO ()
    main = do
      args <- getArgs
      case args of
        [inFile1, infile2, outFile] -> processFiles inFile1 infile2 outFile
        _ -> putStrLn "Expected three file names." >> exitFailure

    readCatFile :: FilePath -> IO Cat
    readCatFile fpath = do
      putStrLn "Reading Cat file."
      parseCat <$> readFile fpath

    readDogFile :: FilePath -> IO Dog
    readDogFile fpath = do
      putStrLn "Reading Dog file."
      parseDog <$> readFile fpath

    writeResultFile :: FilePath -> Result -> IO ()
    writeResultFile fpath result = do
      putStrLn "Writing Result file."
      writeFile fpath $ renderResult result

    processFiles :: FilePath -> FilePath -> FilePath -> IO ()
    processFiles infile1 infile2 outfile = do
      cat <- readCatFile infile1
      dog <- readDogFile infile2
      result <- withDatabaseConnection $ \ db ->
                  processWithDb db cat dog
      writeResultFile outfile result
Once built as per the instructions in the repo, it can be run with:
    dist/build/improved/improved Naive/Cat.hs Naive/Dog.hs /dev/null
    Reading Cat file 'Naive/Cat.hs'
    Reading Dog file 'Naive/Dog.hs'.
    Writing Result file '/dev/null'.
The above code is pretty naive and there is zero indication of what can and cannot fail or how it can fail. Here's a list of some of the obvious failures that may result in an exception being thrown:
- Either of the two readFile calls.
- The writeFile call.
- The parsing functions parseCat and parseDog.
- Opening the database connection.
- The database connection could terminate during the processing stage.
So let's see how the use of the standard Either type, ExceptT from the transformers package and combinators from Gabriel Gonzalez's errors package can improve things.
Firstly, the types of parseCat and parseDog were ridiculous. Parsers can fail with parse errors, so these should both return an Either type. Just about everything else should be in the ExceptT e IO monad. Let's see what that looks like:
    {-# LANGUAGE OverloadedStrings #-}
    import Control.Exception (SomeException)
    import Control.Monad.IO.Class (liftIO)
    import Control.Error (ExceptT, fmapL, fmapLT, handleExceptT
                         , hoistEither, runExceptT)
    import Data.ByteString.Char8 (readFile, writeFile)
    import Data.Monoid ((<>))
    import Data.Text (Text)
    import qualified Data.Text as T
    import qualified Data.Text.IO as T
    import Improved.Cat (Cat, CatParseError, parseCat, renderCatParseError)
    import Improved.Db (DbError, Result, processWithDb, renderDbError
                       , renderResult, withDatabaseConnection)
    import Improved.Dog (Dog, DogParseError, parseDog, renderDogParseError)
    import Prelude hiding (readFile, writeFile)
    import System.Environment (getArgs)
    import System.Exit (exitFailure)

    data ProcessError
      = ECat CatParseError
      | EDog DogParseError
      | EReadFile FilePath Text
      | EWriteFile FilePath Text
      | EDb DbError

    main :: IO ()
    main = do
      args <- getArgs
      case args of
        [inFile1, infile2, outFile] ->
          report =<< runExceptT (processFiles inFile1 infile2 outFile)
        _ -> do
          putStrLn "Expected three file names, the first two are input, the last output."
          exitFailure

    report :: Either ProcessError () -> IO ()
    report (Right _) = pure ()
    report (Left e) = T.putStrLn $ renderProcessError e

    renderProcessError :: ProcessError -> Text
    renderProcessError pe =
      case pe of
        ECat ec -> renderCatParseError ec
        EDog ed -> renderDogParseError ed
        EReadFile fpath msg -> "Error reading '" <> T.pack fpath <> "' : " <> msg
        EWriteFile fpath msg -> "Error writing '" <> T.pack fpath <> "' : " <> msg
        EDb dbe -> renderDbError dbe

    readCatFile :: FilePath -> ExceptT ProcessError IO Cat
    readCatFile fpath = do
      liftIO $ putStrLn "Reading Cat file."
      bs <- handleExceptT handler $ readFile fpath
      hoistEither . fmapL ECat $ parseCat bs
      where
        handler :: SomeException -> ProcessError
        handler e = EReadFile fpath (T.pack $ show e)

    readDogFile :: FilePath -> ExceptT ProcessError IO Dog
    readDogFile fpath = do
      liftIO $ putStrLn "Reading Dog file."
      bs <- handleExceptT handler $ readFile fpath
      hoistEither . fmapL EDog $ parseDog bs
      where
        handler :: SomeException -> ProcessError
        handler e = EReadFile fpath (T.pack $ show e)

    writeResultFile :: FilePath -> Result -> ExceptT ProcessError IO ()
    writeResultFile fpath result = do
      liftIO $ putStrLn "Writing Result file."
      handleExceptT handler . writeFile fpath $ renderResult result
      where
        handler :: SomeException -> ProcessError
        handler e = EWriteFile fpath (T.pack $ show e)

    processFiles :: FilePath -> FilePath -> FilePath -> ExceptT ProcessError IO ()
    processFiles infile1 infile2 outfile = do
      cat <- readCatFile infile1
      dog <- readDogFile infile2
      result <- fmapLT EDb . withDatabaseConnection $ \ db ->
                  processWithDb db cat dog
      writeResultFile outfile result
The first thing to notice is that changes to the structure of the main processing function processFiles are minor but all errors are now handled explicitly. In addition, all possible exceptions are caught as close as possible to the source and turned into errors that are explicit in the function return types. Sceptical? Try replacing one of the readFile calls with an error call or a throw and see it get caught and turned into an error as specified by the type of the function.
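To make the "caught and turned into an error" step concrete, here is a rough sketch of the idea using only base and transformers. The function catchToExceptT is a hypothetical stand-in specialised to IO; it is not the definition of handleExceptT from the errors package. ReadError and brokenRead are likewise invented for the example.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Control.Monad.Trans.Except (ExceptT (..), runExceptT, withExceptT)

-- Hypothetical stand-in for the role handleExceptT plays in the post:
-- run an IO action, catch any exception and map it into the error type.
catchToExceptT :: (SomeException -> e) -> IO a -> ExceptT e IO a
catchToExceptT handler action = withExceptT handler (ExceptT (try action))

-- Hypothetical error type for this sketch.
newtype ReadError = ReadError String
  deriving (Eq, Show)

-- An action that throws when run, standing in for a failing readFile.
brokenRead :: IO String
brokenRead = evaluate (error "disk on fire")

main :: IO ()
main = do
  result <- runExceptT (catchToExceptT (ReadError . show) brokenRead)
  case result of
    Left (ReadError msg) -> putStrLn ("Caught as a value: " ++ msg)
    Right contents -> putStrLn contents
```

One caveat with this sketch: catching SomeException also catches asynchronous exceptions, so in a multi-threaded program it may be worth catching a narrower exception type where the possible failures are known.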
We also see that despite having many different error types (which happens when code is split up into many packages and modules), a constructor for an error type higher in the stack can encapsulate error types lower in the stack. For example, this value of type ProcessError:
EDb (DbError3 ResultError1)
contains a DbError which in turn contains a ResultError. Nesting error types like this aids composition, as does the separation of error rendering (turning an error data type into text to be printed) from printing.
We also see that combinators like fmapLT, together with the nested error types of the previous paragraph, mean that ExceptT monad transformers do compose.
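A small self-contained sketch of that composition, using only base and transformers: withExceptT plays the role that fmapLT plays in the code above, and except combined with first plays the role of hoistEither combined with fmapL. All of the types and functions here (ParseError, StoreError, AppError, parseInput, storeLength, pipeline) are hypothetical and exist only for this example.

```haskell
import Control.Monad.Trans.Except (ExceptT, except, runExceptT, throwE, withExceptT)
import Data.Bifunctor (first)

-- Hypothetical lower-level error types, as if from two separate modules.
data ParseError = ParseEmptyInput
  deriving (Eq, Show)

data StoreError = StoreConnectionLost | StoreRejected String
  deriving (Eq, Show)

-- The higher-level error type wraps the lower-level ones.
data AppError = AppParse ParseError | AppStore StoreError
  deriving (Eq, Show)

-- A pure step that can fail.
parseInput :: String -> Either ParseError Int
parseInput s
  | null s = Left ParseEmptyInput
  | otherwise = Right (length s)

-- An effectful step with its own error type.
storeLength :: Int -> ExceptT StoreError IO ()
storeLength n
  | n > 10 = throwE (StoreRejected "value too large")
  | otherwise = pure ()

-- Both steps are lifted into one ExceptT AppError IO computation.
pipeline :: String -> ExceptT AppError IO ()
pipeline s = do
  n <- except (first AppParse (parseInput s))
  withExceptT AppStore (storeLength n)

main :: IO ()
main = do
  r <- runExceptT (pipeline "")
  print r
```

Each component keeps its own narrow error type, and the only glue needed at the call site is the constructor that embeds it in the wider one.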
Using ExceptT with the combinators from the errors package to catch exceptions as close as possible to their source and converting them to errors has numerous benefits including:
- Errors are explicit in the types of the functions, making the code easier to reason about.
- It's easier to provide better error messages and more context than what is normally provided by the Show instance of most exceptions.
- The programmer spends less time chasing the source of exceptions in large complex code bases.
- More robust code, because the programmer is forced to think about and write code to handle errors instead of error handling being an optional afterthought.
Want to discuss this? Try reddit.
Posted at: 12:22 | Category: CodeHacking/Haskell | Permalink