Postgresql - Idiomatic way to define new vs persisted types in Haskell

Razumov
February 18, 2024
206 views
0 votes
2 Answers

I have a type that represents a persisted record. I want to have a very similar type that represents data that should be POSTed to create a new record.

This is the full type:

data Record = Reading
  { id :: UUID
  , value :: String
  ...
  }

The "new" type is the same minus the "id", which will be auto-generated by the db. How can I define this type? I am using servant to define the API.

My current strategy is to prefix the type and all fields with "new", which works but is redundant for models with many fields. I have also seen the nested strategy, where the two types share a common inner type. I’ve also thought about making the id optional, but I really don’t want clients to be able to post it.

Answers

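You can implement this with a Higher-Kinded Data (HKD) style approach.

First, some language extensions and imports:

{-# LANGUAGE DeriveGeneric #-}         -- for deriving Generic (already on under GHC2021)
{-# LANGUAGE FlexibleContexts #-}      -- for contexts such as Show (f UUID)
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
module Example where

import Data.Functor.Identity
import Data.Proxy
import Data.UUID
import Data.Aeson
import GHC.Generics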

Then, define a record with a higher kinded type parameter:

data Record f = Record  { recordId :: f UUID, recordValue :: String } deriving (Generic)

The Identity functor gives you a variant of this record that always has an id.

type RecordWithId = Record Identity

Using Maybe gives you a variant with an optional id.

type RecordWithOptionalId = Record Maybe

Proxy can be used as a functor with a single uninteresting "unit" value (and no values of the wrapped type). This lets us create a type for a Record with no id.

type RecordWithoutId = Record Proxy

We can derive Show for our Record.

deriving instance (Show (f UUID)) => Show (Record f)
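
Moving between the variants is plain pattern matching. The following is a minimal sketch, not part of the original answer: the helper names assignId and dropId and the example value are made up for illustration, and the Record type is repeated only so the snippet stands alone.

```haskell
import Data.Functor.Identity (Identity (..))
import Data.Proxy (Proxy (..))
import Data.UUID (UUID, nil)

data Record f = Record { recordId :: f UUID, recordValue :: String }

type RecordWithId = Record Identity
type RecordWithoutId = Record Proxy

-- Hypothetical helper: attach a database-generated id to a posted record.
assignId :: UUID -> RecordWithoutId -> RecordWithId
assignId uuid (Record _ v) = Record (Identity uuid) v

-- Hypothetical helper: forget the id, whatever functor currently wraps it.
dropId :: Record f -> RecordWithoutId
dropId (Record _ v) = Record Proxy v

-- Example value using the nil UUID as a stand-in for a real generated id.
example :: RecordWithId
example = assignId nil (Record Proxy "value")
```

A handler that inserts into the database would typically receive a RecordWithoutId, obtain the generated UUID, and return the result of assignId.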

Passing omitNothingFields = True and allowOmittedFields = True in the Aeson options is required to encode and parse a RecordWithoutId as you’d expect. This requires Aeson >= 2.2.0.0 (which, as of writing, is more recent than the version in the latest Stackage snapshot). You could probably implement the Aeson instances by hand if this version bound doesn’t work for you.

instance (ToJSON (f UUID)) => ToJSON (Record f) where
    toJSON = genericToJSON defaultOptions { omitNothingFields = True, allowOmittedFields = True }

instance (FromJSON (f UUID)) => FromJSON (Record f) where
    parseJSON = genericParseJSON defaultOptions { omitNothingFields = True, allowOmittedFields = True }
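
If the version bound is a problem, hand-written instances for the id-less variant might look roughly like this. This is a sketch under assumptions rather than a drop-in replacement: it covers only Record Proxy, it would overlap with the generic instances above if both lived in the same module, and it silently ignores a "recordId" key in the payload instead of rejecting it.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Proxy (Proxy (..))
import Data.UUID (UUID)

data Record f = Record { recordId :: f UUID, recordValue :: String }

-- Decode the id-less variant: only "recordValue" is read from the object.
instance FromJSON (Record Proxy) where
  parseJSON = withObject "Record" $ \o ->
    Record Proxy <$> o .: "recordValue"

-- Encode the id-less variant: no "recordId" key is emitted.
instance ToJSON (Record Proxy) where
  toJSON r = object ["recordValue" .= recordValue r]
```

The other variants (Record Identity, Record Maybe) would need their own instances written in the same style.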

Encoding a value with an ID:

ghci> import qualified Data.ByteString.Lazy.Char8 as BL
ghci> BL.putStrLn $ encode (Record {recordId = Identity nil, recordValue = "value" })
{"recordId":"00000000-0000-0000-0000-000000000000","recordValue":"value"}

Encoding a value without an ID:

ghci> BL.putStrLn $ encode (Record {recordId = Proxy, recordValue = "value" })
{"recordValue":"value"}

Decoding a value without an ID:

ghci> :set -XOverloadedStrings
ghci> decode "{\"recordValue\":\"value\"}" :: Maybe RecordWithoutId
Just (Record {recordId = Proxy, recordValue = "value"})

- Ben
- February 18, 2024 at 5:07 am
- 0 votes
Another approach I’ve used a bit is to define my records without an ID, and use a single generic wrapper type that can add an ID to anything.

Something like this (where I used the term "key" rather than "id" mainly to avoid clashing with the built-in id function, and partly because "Keyed" had a nicer ring to it than "Ided" or "WithId", or whatever):
```
newtype Key a = Key UUID

data Keyed a = Keyed
  { key :: Key a
  , value :: a
  }

data Student = Student
  { name :: Text
  , course :: Course
  , ...
  }
```
My reasoning was that a value of the type Student on its own is just an immutable value. If you change the name of that record you simply compute a different Student value, rather than a representation of a concept like "a student has changed their name". We usually fix this by adding an ID value that is supposed to uniquely correspond to the larger more abstract concept of a real student, so that we can tell when two student records are about the same real student (and so that other records can refer to the real student as well).

But associating an identity value like this is an additional (and significant) add-on on top of the raw record data, and it’s a feature that usually relies on the centralised position of a server or database. But it’s also a separate feature that can be applied to any data record. An id/key is not logically a part of a student record, it’s a feature of our system that allows different student records that exist at different times to be associated with the larger concept of a student in the real world who changes over time. So I find it very useful to have different types for a record that has been associated with an external concept by adding an id, and for just the record itself.

Then I would use Student in APIs where a client was only talking about the details of some student (such as posting data to instruct the server to create a new student record). And I can use Key Student where the client is only dealing in identifying a particular student (such as getting the current details of a student for whom the client already knows the ID), and Keyed Student when dealing with the record contents in the context of it having been assigned an ID (such as when posting an update).
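
Since the question mentions servant, the three roles described above could map onto an API type along these lines. This is only a sketch under assumptions: the route names, the choice of Put for updates, and the trimmed-down Student are made up for illustration, and a real server would additionally need JSON instances for Student and Keyed Student plus a way to parse the captured UUID.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}

import Data.Text (Text)
import Data.UUID (UUID)
import Servant.API

newtype Key a = Key UUID

data Keyed a = Keyed
  { key :: Key a
  , value :: a
  }

-- Hypothetical, trimmed-down record; the real one has more fields.
newtype Student = Student { name :: Text }

type StudentApi =
       -- create: the client posts plain details, the server assigns the key
       "students" :> ReqBody '[JSON] Student :> Post '[JSON] (Keyed Student)
       -- fetch: the client only knows the key
  :<|> "students" :> Capture "key" UUID :> Get '[JSON] Student
       -- update: the client sends details together with the existing key
  :<|> "students" :> ReqBody '[JSON] (Keyed Student) :> Put '[JSON] (Keyed Student)
```

Because the creation endpoint takes a plain Student, there is no id field for a client to post in the first place.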

But this separation isn’t just for type-checking the API requiring the id/key field to be present or absent. I actually use both types within my client and server codebases, not just for communication between the two. Client-level code frequently just isn’t in a position to do anything sensible with IDs other than receive them from the server and pass them back unaltered, but does need to actually deal in IDs to properly form requests so I can’t just hide the ID field from the client entirely. I find it very useful to be able to write parts of my code where they can’t possibly mess up ID handling because they’re never given a type that contains the ID in the first place. For example, a form for editing student details almost certainly should only deal in Student, because it shouldn’t allow editing of the id (a nice side effect is this then makes it easy to reuse the same form for creating new students and editing existing ones).

The HKD pattern suggested by Joe works just as well for creating this separation of types. However, the Keyed type (and Key, if you like the phantom type parameter to help catch mixups) and its associated helpers are something you only have to write once, whereas the HKD pattern is something you apply to every record. I’ve found the HKD pattern can also sometimes get in the way of writing other instances, because unless you write the instance specifically for one of the intended variants you always have a field typed f UUID for an unknown f, which is difficult to work with. The least bothersome manifestation is probably that a simple deriving Show or deriving Eq no longer works; you need to use standalone deriving to be more explicit about the constraints on the derived instance. With the approach of a wrapper type adding the id, both the plain record and the Keyed wrapper are bog-standard simple data types.

(And note that the article Joe linked is using an example where a single higher-kinded f parameter is applied uniformly to every field, to easily generate variant types where all of the fields might be missing in order to handle things like form validation. This is quite a different use case from having a single field where you want variants of the data type to have that field present or not. If you like the HKD technique it’s really easy to come up with several other use cases requiring a higher-kinded type parameter to be applied to different fields, and when you try to support several of these use cases at the same time you have to add multiple higher-kinded type parameters that you can vary for different reasons, and everything just becomes a lot more messy. The HKD technique is one I quite like, but I’ve found it best to use in types that are quite "contained", like a type that is only used to process a form.)

The main drawbacks of the id-wrapper approach are:
1. You do have to write a bit of extra code applying and removing the wrapper at the boundaries between parts of the system that deal with the larger mutable concept and parts that just deal with the simple data itself. A few simple helper functions (or lenses) go a long way, but it’s still there.
2. Existing external systems are not usually structured this way, so if you’re interfacing with those the mismatch can be a little irksome. (e.g. if you’re using an external JSON API, it would take fairly custom instances like ToJSON a => ToJSON (Keyed a) to translate, and in the worst case you might not be able to handle it with a single generic instance). It works best when you’re in control of both the client and the server and can write the APIs to fit your data types, rather than writing data types to fit existing APIs.
3. Applying and removing wrappers has a performance cost. In many contexts it would be negligible, but it’s non-zero.
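
To make drawbacks 1 and 2 concrete, here is a rough sketch of the kind of helpers and custom JSON instance involved. Everything beyond Key and Keyed is an assumption made for illustration: the helper names withKey and modifyValue, the trimmed-down Student, and the choice to flatten the key into the record’s JSON object under an "id" field. It also assumes aeson >= 2.0 for Data.Aeson.KeyMap.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (ToJSON (..), Value (Object), object, (.=))
import qualified Data.Aeson.KeyMap as KeyMap
import Data.UUID (UUID, nil)

newtype Key a = Key UUID deriving (Eq, Show)

data Keyed a = Keyed { key :: Key a, value :: a } deriving (Eq, Show)

-- Hypothetical, trimmed-down record used only for this sketch.
newtype Student = Student { name :: String } deriving (Eq, Show)

instance ToJSON Student where
  toJSON s = object ["name" .= name s]

-- Hypothetical helper: attach a key at the boundary where one becomes known.
withKey :: Key a -> a -> Keyed a
withKey = Keyed

-- Hypothetical helper: edit the payload while leaving the key untouched.
modifyValue :: (a -> a) -> Keyed a -> Keyed a
modifyValue f (Keyed k v) = Keyed k (f v)

-- Flatten the key into the payload's JSON object when there is one;
-- otherwise fall back to a two-field wrapper object.
instance ToJSON a => ToJSON (Keyed a) where
  toJSON (Keyed (Key k) v) = case toJSON v of
    Object o -> Object (KeyMap.insert "id" (toJSON k) o)
    other    -> object ["id" .= k, "value" .= other]

-- Example value using the nil UUID as a stand-in for a real key.
example :: Keyed Student
example = withKey (Key nil) (Student "Ada")
```

Note that modifyValue deliberately keeps the payload type fixed: a Functor instance for Keyed would let a Key Student be silently reinterpreted as a key for some other type, which defeats the purpose of the phantom parameter.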