Domain Modelling with Haskell: Data Structures

Haskell is an amazing language for domain modelling, with its expressive data types and highly reusable abstractions. In this episode we will design the core of a simple project management system, laying the groundwork for later episodes in this small series.

Show Notes

We will model a basic project management system. This episode lays the foundation on which we will refactor and evolve later on.

The project module holds the core data structures. In it, we need to enable the GeneralizedNewtypeDeriving extension for newtypes.

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Project where

import           Data.Text (Text)

We’ll be working with money, so we create a newtype around a Double¹. We call the field unMoney to be able to unwrap it.

newtype Money = Money
  { unMoney :: Double
  } deriving (Show, Eq, Num)

The same goes for ProjectId; it’s a newtype wrapper around an Int.

newtype ProjectId = ProjectId
  { unProjectId :: Int
  } deriving (Show, Eq, Num)

The central data type in this system is the Project, which is either a single project with an ID and name, or a project group, which has a name and a list of sub-projects.

data Project
  = Project ProjectId
            Text
  | ProjectGroup Text
                 [Project]
  deriving (Show, Eq)

The Budget datatype holds two fields; the income and the expenditure of the budget.

data Budget = Budget
  { budgetIncome      :: Money
  , budgetExpenditure :: Money
  } deriving (Show, Eq)

The Transaction is a very simplified model, which is either a sale of some money, or a purchase of some money.

data Transaction
  = Sale Money
  | Purchase Money
  deriving (Eq, Show)

In the Demo module, we’ll disable GHC’s warning for unused imports, as we’ll be using it for REPL development. Importing the Project module, we start writing some data based on our data structures.

{-# OPTIONS_GHC -fno-warn-unused-imports #-}
{-# LANGUAGE OverloadedStrings #-}

module Demo where

import           Project

someProject :: Project
someProject = ProjectGroup "Sweden" [stockholm, gothenburg, malmo]
  where
    stockholm = Project 1 "Stockholm"
    gothenburg = Project 2 "Gothenburg"
    malmo = ProjectGroup "malmo" [city, limhamn]
    city = Project 3 "malmo City"
    limhamn = Project 4 "Limhamn"

The domain model is very generic so far, usable for whatever projects one would like to track. Here I’m using it to model something location-based in Sweden, like construction sites, or something.

Okay, we now have some data, and we can try it out in the REPL.

*Demo> someProject
ProjectGroup "Sweden" [Project (ProjectId {unProjectId = 1})
"Stockholm",Project (ProjectId {unProjectId = 2}) "Gothenburg",
ProjectGroup "Malmo" [Project (ProjectId {unProjectId = 3})
"Malmo City",Project (ProjectId {unProjectId = 4}) "Limhamn"]]

Yeah, it works. But now our customer wants some reporting. We’ll create a new module called Database, where we will implement some fake queries to work with. In a real system, these would likely be proper database queries.

To generate some fake data, we’ll use the System.Random module. We’ll also need our Project module, and then we can write queries for budgets and transactions. Both functions ignore the project IDs, and generate money values with certain ranges, constructing budgets and lists of transactions, respectively.

module Database where

import           System.Random (getStdRandom, randomR)

import Project

getBudget :: ProjectId -> IO Budget
getBudget _ = do
  income <- Money <$> getStdRandom (randomR (0, 10000))
  expenditure <- Money <$> getStdRandom (randomR (0, 10000))
  pure Budget { budgetIncome = income
              , budgetExpenditure = expenditure
              }

getTransactions :: ProjectId -> IO [Transaction]
getTransactions _ = do
  sale <- Sale . Money <$> getStdRandom (randomR (0, 4000))
  purchase <- Purchase . Money <$> getStdRandom (randomR (0, 4000))
  pure [sale, purchase]

We can now go ahead and implement our reporting module. We’ll import the gemSum function from Data.Monoid, and our own modules.

module Reporting where

import           Data.Monoid (getSum)

import qualified Database    as DB
import           Project

The Report is a data structure of three fields; the budgeted profit, the actual net profit, and the difference between the two.

data Report = Report
  { budgetProfit :: Money
  , netProfit    :: Money
  , difference   :: Money
  } deriving (Show, Eq)

We’ll create a pure function called calculateReport. The budget profit is defined as the difference between the budgeted income and the budgeted expenditure. The net profit is defined as the sum of the profit of all transactions. A sales transaction is considered a profit, and a purchase is negated.

calculateReport :: Budget -> [Transaction] -> Report
calculateReport budget transactions =
  Report
  { budgetProfit = budgetProfit'
  , netProfit = netProfit'
  , difference = netProfit' - budgetProfit'
  }
  where
    budgetProfit' = budgetIncome budget - budgetExpenditure budget
    netProfit' = getSum (foldMap asProfit transactions)
    asProfit (Sale m)     = pure m
    asProfit (Purchase m) = pure (negate m)

We’ll now use our fake DB queries to calculate a report for a project. It’s defined with the calc function, which given a single project will use our pure calculateReport function over the results of the getBudget and getTransactions queries. Given a project group it will fold over the projects using the calc function, and thus recurse.

calculateProjectReport :: Project -> IO Report
calculateProjectReport = calc
  where
    calc (Project p _) =
      calculateReport <$> DB.getBudget p <*> DB.getTransactions p
    calc (ProjectGroup _ projects) = foldMap calc projects

For this to work, we need a Monoid instance for Report, to be able to concatenate them. The empty element is a report with all zeros. The mappend implementation will take the budgeted profit, the net profit, and the difference, of both reports, and sum them up.

instance Monoid Report where
  mempty = Report 0 0 0
  mappend (Report b1 n1 d1) (Report b2 n2 d2) =
    Report (b1 + b2) (n1 + n2) (d1 + d2)

Let’s go back to the Demo module and try this out in the REPL. Given that we have a project, we can calculate a report.

*Demo> calculateProjectReport someProject
Report {budgetProfit = Money {unMoney = -5392.74046336179},
netProfit = Money {unMoney = 2191.2802854168813}, difference =
Money {unMoney = 7584.020748778671}}

There we go! We would like to format it a bit nicer, though.

We’ll create a new module called PrettyPrint with some imports. Most noteworthy is Data.Tree, which we will use to visualize our data structure.

{-# LANGUAGE OverloadedStrings #-}

module PrettyPrint where

import qualified Data.Text   as Text
import           Data.Tree
import           Text.Printf

import           Project
import           Reporting

Now we can write asTree, a function from a Project to a Tree with String labels. This is a data structure in the containers package that is very useful for visualizing nested data. We pattern match on the project, and given a single project we create a node, with a label produced by printf, and with no child nodes. Given a project group, we use the name as a label, and map the asTree function over the subprojects to create child nodes.

asTree :: Project -> Tree String
asTree project =
  case project of
    Project (ProjectId p) name ->
      Node (printf "%s (%d)" name p) []
    ProjectGroup name projects ->
      Node (Text.unpack name) (map asTree projects)

The prettyProject function is a little helper that composes Data.Tree.drawTree with our asTree function.

prettyProject :: Project -> String
prettyProject = drawTree . asTree

The prettyReport function uses printf to create a single-line string, with a budget, the net profit, and the difference. The %.2f used in the printf strings format the Double values with two decimals. The last one always includes a sign. The unMoney function unwraps a Money value, picking out the Double.

prettyReport :: Report -> String
prettyReport r =
  printf
    "Budget: %.2f, Net: %.2f, difference: %+.2f"
    (unMoney (budgetProfit r))
    (unMoney (netProfit r))
    (unMoney (difference r))

Back in the Demo module, we import our PrettyPrint module and load it in the REPL. We can print the pretty version of a Project:

*Demo> putStrLn (prettyProject someProject)
Sweden
|
+- Stockholm (1)
|
+- Gothenburg (2)
|
`- Malmo
   |
   +- Malmo City (3)
   |
   `- Limhamn (4)

And we can calculate a project report and print it, using the prettyReport function.

*Demo> pr <- calculateProjectReport someProject
*Demo> putStrLn (prettyReport pr)
Budget: 3954.27, Net: -268.75, Difference: -4223.02

And that is it for this episode on domain modelling with Haskell! In the next episode we will see how we can generalize our model, using Functor, Traversable, and Foldable. Stay tuned, and thanks for watching!

Source Code

The source code for the full series is available at github.com/haskell-at-work/domain-modelling-with-haskell.

The use of Double for modelling money is not well-suited for real systems, and should be replaced with a proper arbitrary-precision data type, e.g. Scientific. In the spirit of keeping down the scope of this tutorial Double was still used.