Thursday, December 27, 2012

Free monads for structuring Haskell web apps

As it often happens, I found out pretty soon after my last post that there was a much easier way to work with free monads in Haskell already existing. The operational package makes it fun and easy to get started with free monads and avoids a performance problem with the naive implementation.

Purify code using free monads has a great description of how free monads can further isolate IO actions. Free monads allow you to have multiple "run" or "interpret" functions that actually carry out the actions of the monad. This allows you to:

  • Change implementations without changing client code
  • Manage configuration in the run function rather than in client code
  • Run client code in purely functional test harnesses

I've started experimenting with using free monads for these purposes in a web application. It's not in any way ready but the experience has been pretty interesting so far, so I've decided to post about it.

Managing configuration


Yesod and Snap already provide everything needed to make a web application but I wanted to experiment a bit more. I decided to use the Warp webserver directly through the Wai interface, which should expose more gritty configuration issues. I liked Dropwizard's approach: have a Configuration class serialized as YAML and have main read this to configure the app. Eventually I want run time configuration but this is a nice way to start.

So far the only configuration is the location of an sqlite3 database file.

Configuration.yaml


 databaseFile: sandboxData/cakeStore.sqlite3  

Configuration/Types.hs


 {-# LANGUAGE DeriveGeneric #-}  
 module Configuration.Types where  
 import GHC.Generics (Generic)  
 import Data.Aeson (FromJSON, ToJSON)  
 data Configuration = Configuration { databaseFile :: String }  
   deriving (Show, Generic)  
 instance ToJSON Configuration  
 instance FromJSON Configuration  

Configuration/Util.hs

 module Configuration.Util  
 where  
 import Data.Yaml  
 import Configuration.Types  
 readConfiguration :: FilePath -> IO (Maybe Configuration)  
 readConfiguration filePath = decodeFile filePath  

Database setup


We can now use this to read in our database configuration.
So far I've just got a basic script to initialize the database and add tables. As you can see there are still some embarrassing test values hanging around.

Ops/DataSetup.hs

 import Database.HDBC   
 import Database.HDBC.Sqlite3  
 import System.Environment  
 import qualified Configuration.Util as ConfigurationUtil  
 import qualified Configuration.Types as ConfigurationTypes  
 main :: IO ()  
 main = do  
    args <- getArgs  
    let configFile = head args  
    print $ "Reading config file " ++ configFile  
    maybeConfiguration <- ConfigurationUtil.readConfiguration configFile  
    case maybeConfiguration of  
         Just configuration -> runConfiguration configuration  
      Nothing -> return ()  
 runConfiguration configuration = do  
    conn <- connectSqlite3 $ ConfigurationTypes.databaseFile $ configuration  
    tables <- getTables conn  
    print tables  
    withTransaction conn $ createTest tables  
    disconnect conn  
 createTest tables conn = do    
   if not $ elem "users" tables  
      then do  
         run conn "CREATE TABLE users (id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(256))" []  
         run conn "INSERT INTO users (name) VALUES ('DatabaseBob')" []  
         query <- quickQuery' conn "SELECT * from users where id < 2" []  
         print query  
         return ()  
      else return ()  
   if not $ elem "cakes" tables  
      then do  
         run conn "CREATE TABLE cakes (id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(256))" []          
         return ()  
      else return ()  

The setup script can directly connect to the database, but the app itself needs something a little more structured. In Java-land data access objects are usually used to manage persistence details.
This is a good place for the first free monad.

Data/DataHandler.hs (summary)

I'd like to have a data layer, a collection of data access actions that can only (1) perform database transactions (2) call other data access actions. I'll probably have to switch the database library from hdbc to one of the *-simple libraries at one point, and I'd like to only have to change these details in the data access actions when I do. This type signature doesn't lock down the code too much but does provide a hint to other developers that only database access should be done in this monad.

 data DataInstruction a   
   where CallData :: DataCall a -> DataInstruction a  
      WithTransaction :: (Connection -> IO a) -> DataInstruction a  

A quick declaration with the operational package and we have a monad:

 type DataMonad a = Program DataInstruction a  

Now to interpret data access actions, data calls are interpreted by just running them and returning the result. Transactional actions are interpreted by connecting to the sqlite3 database, running the transaction. Eventually this should be changed so the connection should be held until the web request doesn't need it anymore, and connection pooling should be added with resource-pool.

 runDataMonadWithConfiguration :: DataConfiguration -> DataMonad a -> IO a  
 runDataMonadWithConfiguration dataConfiguration = eval.view  
  where   
   eval :: ProgramView (DataInstruction) a -> IO a  
   eval (Return x) = return x  
   eval (CallData (DataCall {execution=exec}) :>>= k ) =   
    do  
     result <- runDataMonadWithConfiguration dataConfiguration exec  
     runDataMonadWithConfiguration dataConfiguration $ k result  
   eval ((WithTransaction trans) :>>= k) =  
    do  
     result <- bracket (connectSqlite3 $ databaseFile $ dataConfiguration)  
              (\conn -> disconnect conn)  
              (\conn -> withTransaction conn trans )  
     runDataMonadWithConfiguration dataConfiguration $ k result   

Data/Users.hs


With the run function handling the connection details, data access actions can be defined in other modules. These modules can now be very light, they only have to import Database.HDBC and Data.DataHandler. I've made a strange choice here of wrapping the actions into a DataCall data structure. This is mostly to experiment with logging and mocking, but this data type also makes a nice hook for other free monads to connect to.

 getUsersCall :: Data.DataHandler.DataCall [[SqlValue]]  
 getUsersCall = Data.DataHandler.DataCall {  
  Data.DataHandler.name = "getUsersCall",  
  Data.DataHandler.execution = selectUsers,  
  Data.DataHandler.provideResult = Data.DataHandler.provideBlank   
  }  
 selectUsers :: Data.DataHandler.DataMonad [[SqlValue]]  
 selectUsers = do  
  let selectUsersQuery connection = quickQuery' connection "SELECT * from users" []  
  users <- Data.DataHandler.withTrans selectUsersQuery   
  return users  

Assembling the data layer


Now there's enough code to put together the data layer and run a test. There's some STM code here that isn't really used, I eventually want to use STM for runtime configuration.

Tests/DataTest.hs

 {-# LANGUAGE OverloadedStrings #-}  
 module Tests.DataTest  
 (tests)  
 where  
 import Test.HUnit  
 import Data.DataHandler  
 import Data.Users  
 import qualified Configuration.Util  
 import Control.Concurrent.STM  
 import Database.HDBC  
 tests = TestList [  
  TestLabel "database test"  
  $ TestCase $ do  
   Just configuration <- Configuration.Util.readConfiguration "Configuration.yaml"  
   configurationTVar <- setupDataMonad configuration  
   dataConfiguration <- atomically $ readTVar configurationTVar  
   users <- handleWithConfiguration dataConfiguration getUsersCall    
   assertBool "Successfully queried" (users == [[SqlByteString "1",SqlByteString "DatabaseBob"],[SqlByteString "2",SqlByteString "Steve"]])  
  ]   

Service layer


From here we just continue the plumbing up the stack. Service layer actions can only call other service layer actions or make data layer calls.
 data ServiceInstruction a  
   where CallService :: ServiceCall a -> ServiceInstruction a  
      CallData :: Data.DataHandler.DataCall a -> ServiceInstruction a  

Tests/ServiceTest.hs

  TestLabel "service to database test"  
  $ TestCase $ do  
   Just configuration <- Configuration.Util.readConfiguration "Configuration.yaml"  
   configurationTVar <- Data.DataHandler.setupDataMonad configuration  
   serviceConfiguration <- Service.ServiceHandler.setupServiceMonad configuration configurationTVar  
   user <- Service.ServiceHandler.handleWithConfiguration serviceConfiguration Service.Users.getUser  
   assertBool "Some users" ( (length user) > 0)  

Web layer

Finally up to the web layer. The web layer can only make service calls and render bytestrings to the client. The ServiceCall type prevents the web layer from making data calls directly. This layer needs a lot more work, especially parameter parsing, but the nice thing is that these actions should be easily runnable within other web servers/frameworks if needed. Handling the Wai interface is moved to the run function so most of the application code does not need to know about it.

Web/WebHandler.hs

 data HandlerInstruction a  
   where RenderView :: BlazeBuilder.Builder -> HandlerInstruction ()  
      CallService :: ServiceHandler.ServiceCall a -> HandlerInstruction a  

All view rendering can then be pulled into separate modules.
 listCakes :: Web.WebHandler.HandlerMonad ()  
 listCakes = do      
  cakes <- Web.WebHandler.callService Service.Cakes.getCakes  
  Web.WebHandler.renderView $ Web.View.Cake.render $ cakes  
  return ()  

Summary

So far with this experiment I've found:
  • Free monads can be used to separate application layers in a fairly usable way.
    • This adds a decent amount of extra code in the beginning but provides a lot of flexibility and I expect this code would stay fairly constant as the application grows.
    • A command line application that hits the service layer was easy to make.
  • Moving configuration management to monad run functions is handy. An interesting possibility is to have the run function keep a reference to its configuration in STM. The run function could then be configured by passing it configuration actions that change the configuration.
  • Wrapping actions allows some new options for organizing code although it is ugly. Some of these could probably be achieved with newtype, but there's still more experimenting to do. One idea is restricting the web layer to only call AuthenticatedServiceCalls but allowing the service layer to call both AuthenticatedServiceCalls and plain ServiceCalls

Mock testing and free monads:


I'd really like to do some mock testing with free monads. Growing Object Oriented Software Guided by Tests is one of my favourite books and I think a lot of its ideas can be used in building Haskell applications. It will probably take some work to translate the concepts over though. I certainly haven't done TDD with the code so far, but in its current state it doesn't seem too far off.

Mock testing would do a lot to help this situation. A workable way to program in action return values could also lead to a very powerful way to test applications using LogicT or QuickCheck to generate expectations.

I managed to hack in some sort of mocking using Data.Dynamic but it's not pretty. This test case happens to "know" that the only DataCall made will return a list of Cakes, but I really want to match on the signature of the call and then provide the appropriate data back.

  TestLabel "mock getCakes"  
   $ TestCase $ do    
    let interpretGetCakes = eval . view  
      eval :: ProgramView (Service.ServiceHandler.ServiceInstruction) a -> IO a  
      eval (Return x) = return x       
      eval (Service.ServiceHandler.CallData call :>>= k) = do  
       let result = Data.DataHandler.provideResult call (toDyn [Model.Cake.Cake 0 "Black Forest"])  
       interpretGetCakes $ k result  
    result <- interpretGetCakes $ Service.ServiceHandler.execution Service.Cakes.getCakes  
    assertBool "Expecting cakes" (result == [Model.Cake.Cake 0 "Black Forest"])  


Everything here is still a work in progress but it's worked out a lot better than I would have thought so far. All the code here is available at https://github.com/stevechy/HaskellCakeStore which I hope to keep updating it as the experiments continue.