m3ga blog

Sun, 30 Apr 2017

What do you mean ExceptT doesn't Compose?

Disclaimer: I work at Ambiata (our GitHub presence), probably the biggest Haskell shop in the southern hemisphere. Although I mention some of Ambiata's coding practices, in this blog post I am speaking for myself and not for Ambiata. However, the way I'm using ExceptT and handling exceptions in this post is something I learned from my colleagues at Ambiata.

At work, I've been spending some time tracking down exceptions in some of our Haskell code that have been bubbling up to the top level and killing a complex multi-threaded program. On Friday I posted a somewhat flippant comment to Google Plus:

Using exceptions for control flow is the root of many evils in software.

Lennart Kolmodin, who I remember from my very earliest days of using Haskell in 2008 and who I met for the first time at ICFP in Copenhagen in 2011, responded:

Yet what to do if you want composable code? Currently I have
type Rpc a = ExceptT RpcError IO a
which is terrible

But what do we mean by "composable"? I like the Wikipedia definition:

Composability is a system design principle that deals with the inter-relationships of components. A highly composable system provides recombinant components that can be selected and assembled in various combinations to satisfy specific user requirements.

The ensuing discussion, which also included Sean Leather, suggested that these two experienced Haskellers were not aware that with the help of some combinator functions, ExceptT composes very nicely and results in more readable and more reliable code.

At Ambiata, our coding guidelines strongly discourage the use of partial functions. Since the type signature of a function doesn't include information about the exceptions it might throw, the use of exceptions is strongly discouraged. When using library functions that may throw exceptions, we try to catch those exceptions as close as possible to their source and turn them into errors that are explicit in the type signatures of the code we write. Finally, we avoid using String to hold errors. Instead we construct data types to carry error messages and render functions to convert them to Text.
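As a minimal sketch of these guidelines (the names PortError, renderPortError and parsePort are hypothetical and do not come from the demo repo or from Ambiata's code), a partial read can be replaced by a total function whose failures are an explicit data type, with a separate render function producing Text:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Monoid ((<>))
import           Data.Text (Text)
import qualified Data.Text as T
import           Text.Read (readMaybe)

-- Hypothetical error type: each way the function can fail gets a constructor.
data PortError
  = PortNotANumber Text
  | PortOutOfRange Int
  deriving (Eq, Show)

-- Rendering is kept separate from the error value itself.
renderPortError :: PortError -> Text
renderPortError pe =
  case pe of
    PortNotANumber t -> "Port '" <> t <> "' is not a number."
    PortOutOfRange n -> "Port " <> T.pack (show n) <> " is outside 1-65535."

-- Total replacement for a partial 'read': every failure shows up in the type.
parsePort :: Text -> Either PortError Int
parsePort t =
  case readMaybe (T.unpack t) of
    Nothing -> Left (PortNotANumber t)
    Just n
      | n < 1 || n > 65535 -> Left (PortOutOfRange n)
      | otherwise -> Right n
```

Loaded into GHCi, parsePort "abc" evaluates to a Left value that the caller has to deal with, rather than throwing the way read would.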

In order to properly demonstrate the ideas, I've written some demo code and made it available in this GitHub repo. It compiles and even runs (providing you give it the required number of command line arguments) and hopefully does a good job demonstrating how the bits fit together.

So let's look at the naive version of a program that doesn't do any exception handling at all.


  import Data.ByteString.Char8 (readFile, writeFile)

  import Naive.Cat (Cat, parseCat)
  import Naive.Db (Result, processWithDb, renderResult, withDatabaseConnection)
  import Naive.Dog (Dog, parseDog)

  import Prelude hiding (readFile, writeFile)

  import System.Environment (getArgs)
  import System.Exit (exitFailure)

  main :: IO ()
  main = do
    args <- getArgs
    case args of
      [inFile1, infile2, outFile] -> processFiles inFile1 infile2 outFile
      _ -> putStrLn "Expected three file names." >> exitFailure

  readCatFile :: FilePath -> IO Cat
  readCatFile fpath = do
    putStrLn "Reading Cat file."
    parseCat <$> readFile fpath

  readDogFile :: FilePath -> IO Dog
  readDogFile fpath = do
    putStrLn "Reading Dog file."
    parseDog <$> readFile fpath

  writeResultFile :: FilePath -> Result -> IO ()
  writeResultFile fpath result = do
    putStrLn "Writing Result file."
    writeFile fpath $ renderResult result

  processFiles :: FilePath -> FilePath -> FilePath -> IO ()
  processFiles infile1 infile2 outfile = do
    cat <- readCatFile infile1
    dog <- readDogFile infile2
    result <- withDatabaseConnection $ \ db ->
                 processWithDb db cat dog
    writeResultFile outfile result

Once built as per the instructions in the repo, it can be run with:


  dist/build/improved/improved Naive/Cat.hs Naive/Dog.hs /dev/null
  Reading Cat file 'Naive/Cat.hs'
  Reading Dog file 'Naive/Dog.hs'.
  Writing Result file '/dev/null'.

The above code is pretty naive and there is zero indication of what can and cannot fail or how it can fail. Here's a list of some of the obvious failures that may result in an exception being thrown:

Either of the two readFile calls.
The writeFile call.
The parsing functions parseCat and parseDog.
Opening the database connection.
The database connection could terminate during the processing stage.

So let's see how the use of the standard Either type, ExceptT from the transformers package and combinators from Gabriel Gonzalez's errors package can improve things.

Firstly, the types of parseCat and parseDog were ridiculous. Parsers can fail with parse errors, so these should both return an Either type. Just about everything else should be in the ExceptT e IO monad. Let's see what that looks like:


  {-# LANGUAGE OverloadedStrings #-}
  import           Control.Exception (SomeException)
  import           Control.Monad.IO.Class (liftIO)
  import           Control.Error (ExceptT, fmapL, fmapLT, handleExceptT
                                 , hoistEither, runExceptT)

  import           Data.ByteString.Char8 (readFile, writeFile)
  import           Data.Monoid ((<>))
  import           Data.Text (Text)
  import qualified Data.Text as T
  import qualified Data.Text.IO as T

  import           Improved.Cat (Cat, CatParseError, parseCat, renderCatParseError)
  import           Improved.Db (DbError, Result, processWithDb, renderDbError
                               , renderResult, withDatabaseConnection)
  import           Improved.Dog (Dog, DogParseError, parseDog, renderDogParseError)

  import           Prelude hiding (readFile, writeFile)

  import           System.Environment (getArgs)
  import           System.Exit (exitFailure)

  data ProcessError
    = ECat CatParseError
    | EDog DogParseError
    | EReadFile FilePath Text
    | EWriteFile FilePath Text
    | EDb DbError

  main :: IO ()
  main = do
    args <- getArgs
    case args of
      [inFile1, infile2, outFile] ->
              report =<< runExceptT (processFiles inFile1 infile2 outFile)
      _ -> do
          putStrLn "Expected three file names, the first two are input, the last output."
          exitFailure

  report :: Either ProcessError () -> IO ()
  report (Right _) = pure ()
  report (Left e) = T.putStrLn $ renderProcessError e


  renderProcessError :: ProcessError -> Text
  renderProcessError pe =
    case pe of
      ECat ec -> renderCatParseError ec
      EDog ed -> renderDogParseError ed
      EReadFile fpath msg -> "Error reading '" <> T.pack fpath <> "' : " <> msg
      EWriteFile fpath msg -> "Error writing '" <> T.pack fpath <> "' : " <> msg
      EDb dbe -> renderDbError dbe


  readCatFile :: FilePath -> ExceptT ProcessError IO Cat
  readCatFile fpath = do
    liftIO $ putStrLn "Reading Cat file."
    -- Catch any exception thrown by readFile and convert it to EReadFile.
    bs <- handleExceptT handler $ readFile fpath
    -- Lift the parser's Either into ExceptT, wrapping its error in ECat.
    hoistEither . fmapL ECat $ parseCat bs
    where
      handler :: SomeException -> ProcessError
      handler e = EReadFile fpath (T.pack $ show e)

  readDogFile :: FilePath -> ExceptT ProcessError IO Dog
  readDogFile fpath = do
    liftIO $ putStrLn "Reading Dog file."
    bs <- handleExceptT handler $ readFile fpath
    hoistEither . fmapL EDog $ parseDog bs
    where
      handler :: SomeException -> ProcessError
      handler e = EReadFile fpath (T.pack $ show e)

  writeResultFile :: FilePath -> Result -> ExceptT ProcessError IO ()
  writeResultFile fpath result = do
    liftIO $ putStrLn "Writing Result file."
    handleExceptT handler . writeFile fpath $ renderResult result
    where
      handler :: SomeException -> ProcessError
      handler e = EWriteFile fpath (T.pack $ show e)

  processFiles :: FilePath -> FilePath -> FilePath -> ExceptT ProcessError IO ()
  processFiles infile1 infile2 outfile = do
    cat <- readCatFile infile1
    dog <- readDogFile infile2
    -- fmapLT wraps the database layer's DbError in the EDb constructor.
    result <- fmapLT EDb . withDatabaseConnection $ \ db ->
                 processWithDb db cat dog
    writeResultFile outfile result

The first thing to notice is that changes to the structure of the main processing function processFiles are minor but all errors are now handled explicitly. In addition, all possible exceptions are caught as close as possible to the source and turned into errors that are explicit in the function return types. Sceptical? Try replacing one of the readFile calls with an error call or a throw and see it get caught and turned into an error as specified by the type of the function.
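The experiment suggested above can be sketched using only base, transformers and text. The helper handleIO below is a hypothetical stand-in for handleExceptT from the errors package (it is specialised to IO and is not the library's implementation), and ProcessError is cut down to the single constructor needed here:

```haskell
import           Control.Exception (SomeException, try)
import           Control.Monad.Trans.Except (ExceptT (..), runExceptT)
import           Data.Text (Text)
import qualified Data.Text as T

-- Cut-down copy of the error type, keeping only the read failure.
data ProcessError
  = EReadFile FilePath Text
  deriving (Eq, Show)

-- Hypothetical stand-in for handleExceptT, specialised to IO: run the
-- action, catch any exception and convert it into an explicit error.
handleIO :: (SomeException -> e) -> IO a -> ExceptT e IO a
handleIO handler action = ExceptT $ do
  r <- try action
  pure $ either (Left . handler) Right r

-- A "read" that blows up with 'error' instead of touching the file.
brokenRead :: FilePath -> ExceptT ProcessError IO Text
brokenRead fpath =
  handleIO (EReadFile fpath . T.pack . show) $ error "disk on fire"
```

Running runExceptT (brokenRead "cat.txt") in GHCi should give a Left (EReadFile "cat.txt" ...) value instead of an exception escaping to the top level. The exact message text depends on how the GHC version in use shows the exception.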

We also see that despite having many different error types (which happens when code is split up into many packages and modules), a constructor for an error type higher in the stack can encapsulate error types lower in the stack. For example, this value of type ProcessError:


  EDb (DbError3 ResultError1)

contains a DbError which in turn contains a ResultError. Nesting error types like this aids composition, as does the separation of error rendering (turning an error data type into text to be printed) from printing.
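The Improved.Db module is not shown in this post, so the following is only a guess at the shape such types might take. The constructors DbError1 and ResultError2, and all of the message strings, are invented for illustration. Each layer wraps the error type of the layer below and delegates rendering to it:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Monoid ((<>))
import           Data.Text (Text)

-- Lowest layer. ResultError2 is an invented second constructor.
data ResultError
  = ResultError1
  | ResultError2

renderResultError :: ResultError -> Text
renderResultError re =
  case re of
    ResultError1 -> "result error 1"
    ResultError2 -> "result error 2"

-- Middle layer: DbError3 carries the lower-level error unchanged.
data DbError
  = DbError1
  | DbError3 ResultError

renderDbError :: DbError -> Text
renderDbError dbe =
  case dbe of
    DbError1 -> "Db error: connection failed"
    DbError3 re -> "Db error: " <> renderResultError re

-- Top layer, cut down to the one constructor relevant here.
data ProcessError
  = EDb DbError

renderProcessError :: ProcessError -> Text
renderProcessError pe =
  case pe of
    EDb dbe -> renderDbError dbe
```

Because every layer only knows about the layer directly beneath it, a new failure mode in the lowest module needs one new constructor and one new render case, and nothing higher up has to change.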

We also see that, with combinators like fmapLT and the nested error types of the previous paragraph, ExceptT monad transformers do compose.
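To make the composition concrete, here is a small sketch using withExceptT from the transformers package, which plays the same role as fmapLT from errors. All of the types and functions below (ParseError, StoreError, AppError, parseItem, storeItem, process) are hypothetical and exist only for this example:

```haskell
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE, withExceptT)

-- Two hypothetical components, each with its own error type.
data ParseError = EmptyInput deriving (Eq, Show)
data StoreError = StoreFull deriving (Eq, Show)

-- Error type of the layer that combines them.
data AppError
  = EParse ParseError
  | EStore StoreError
  deriving (Eq, Show)

parseItem :: String -> ExceptT ParseError IO Int
parseItem s
  | null s = throwE EmptyInput
  | otherwise = pure (length s)

storeItem :: Int -> ExceptT StoreError IO ()
storeItem n
  | n > 10 = throwE StoreFull
  | otherwise = pure ()

-- withExceptT lifts each component's error into AppError, so the two
-- computations share one monad and sequence with ordinary do-notation.
process :: String -> ExceptT AppError IO ()
process s = do
  n <- withExceptT EParse $ parseItem s
  withExceptT EStore $ storeItem n
```

Neither parseItem nor storeItem knows about AppError, yet process composes them, and its type states exactly which failures can come out.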

Using ExceptT with the combinators from the errors package to catch exceptions as close as possible to their source and converting them to errors has numerous benefits including:

Errors are explicit in the types of the functions, making the code easier to reason about.
It's easier to provide better error messages and more context than what is normally provided by the Show instance of most exceptions.
The programmer spends less time chasing the source of exceptions in large complex code bases.
More robust code, because the programmer is forced to think about and write code to handle errors instead of error handling being an optional afterthought.

Want to discuss this? Try reddit.

Posted at: 12:22 | Category: CodeHacking/Haskell | Permalink
