Redis Data Modeling with Rank 2 Types

Usually when we discuss polymorphism in programming, what we mean is that a structure or function is universally quantified over a type variable. For example, it makes sense to talk about List a as a concept without any concern for which type a is chosen. From there, we can talk about various properties of polymorphic structures, such as whether one is a Functor or a Traversable. But polymorphism over a single type variable is not the only kind of polymorphism! What if, instead of quantifying over a type, we quantified over things that themselves quantify over a type? This is rank 2 polymorphism. In this article, I want to demonstrate a simple use of rank 2 polymorphism to construct interesting Redis queries.

Consider the following polymorphic data type:

-- rank 2 polymorphic type
data User f = User {
    weight :: f Mass
  , height :: f Length
  , firstName :: f String
  , lastName :: f String
  , birthday :: f UTCTime
  }

It looks almost the same as any normal record structure, except for that strange quantified f applied to every field. What does that even mean? Well, first consider the simplest example, f = Identity: every field is wrapped in the Identity functor, so conceptually it is just a standard record type. How about f = Maybe? This is already more interesting: it means that each field of the record is individually optional.
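
For example, here is an illustrative User Maybe value (the particular names are just placeholder data) that only knows the person's name:

nameOnly :: User Maybe
nameOnly = User {
    weight = Nothing
  , height = Nothing
  , firstName = Just "Ada"
  , lastName = Just "Lovelace"
  , birthday = Nothing
  }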

And in fact, this is all the understanding you need to use the resulting API from this article! The final API will be two queries: setUser :: UserId -> User Maybe -> Redis () and getUsers :: [ UserId ] -> Redis [ User Maybe ], with some interesting semantics: you can provide as many or as few fields as you like when setting the data; empty fields are ignored (they will not overwrite existing data!) and the provided fields are stored.
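
To make those semantics concrete, here is a hypothetical usage sketch, assuming the nameOnly value from above and some uid :: UserId in scope: only the name fields are written, anything previously stored in the other fields is left intact, and fields that were never set come back as Nothing.

updateAndFetch :: UserId -> Redis [ User Maybe ]
updateAndFetch uid = do
  setUser uid nameOnly
  getUsers [ uid ]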

Now to actually implement this simple API, things get a bit trickier. The underlying algebraic structure is fairly elegant in my opinion, but working with it in Haskell is unfortunately a bit obtuse. First, our general concepts like functors, traversables, applicatives, sums, products, etc. all carry over into rank 2 polymorphism: for example, if you give me a User Maybe and a User [], I can give you a User (Maybe :*: []). Already this is slightly awkward, as we're now using :*: to talk about products instead of (,) as we usually do, and unfortunately this is how everything will be. We need a new Functor which is conceptually similar to the normal Functor, but instead of fmap :: (a -> b) -> f a -> f b it operates on rank 2 types and natural transformations (well...): fmap :: (forall a. f a -> g a) -> r f -> r g. In our User example, that means you give me a natural transformation from f to g and I'll give you a function from User f to User g. So does this mean we need a whole duplicate vocabulary for dealing with the same ideas in rank 2 space? Well... yes. The good news is that there is a package on Hackage, rank2classes, which declares all of these and even provides Template Haskell utilities for deriving them automatically!
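
Before reaching for the package, here is what a hand-written rank 2 fmap for User looks like; it needs the RankNTypes extension, and it is exactly the sort of boilerplate the deriving machinery saves us from writing:

userMap :: (forall a. f a -> g a) -> User f -> User g
userMap t u = User {
    weight = t (weight u)
  , height = t (height u)
  , firstName = t (firstName u)
  , lastName = t (lastName u)
  , birthday = t (birthday u)
  }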

So let's bring in that package and check out what that product example I mentioned would look like:

$(Rank2.TH.deriveAll ''User)

productExample :: User Maybe -> User [] -> User (Maybe :*: [])
productExample x y =
  (Rank2.Arrow . (:*:)) Rank2.<$> x Rank2.<*> y

This is essentially (,) <$> x <*> y in normal rank 1 Haskell speak. So what can we do with this? Well, let's try point-wise data storage! Using my high-level Redis library, we should be able to declare a RedisBasic UserId for each field by simply plugging it in: User (RedisBasic UserId). Unfortunately, for reasons I don't understand (but I'm sure are very sensible), GHC does not accept this type directly, although it does accept it if we wrap the field type in a newtype. So, with that in mind, the following is accepted:

newtype RBasic a =
  RBasic { runRBasic :: RedisBasic UserId (Maybe a) }

userStorage :: User RBasic
userStorage =  User {
    weight = RBasic . declareBasic . fromString $ "user-weight"
  , height = RBasic . declareBasic . fromString $ "user-height"
  , firstName = RBasic . declareBasic . fromString $ "user-firstName"
  , lastName = RBasic . declareBasic . fromString $ "user-lastName"
  , birthday = RBasic . declareBasic . fromString $ "user-birthday"
  }

Great! Actually, I don't quite like this, but making it conceptually cleaner is, again, a bit obtuse. Obviously we want to factor out the repeated part, so it seems like we could just have User (Const String) for the name declarations and then map over them. But there's a hidden piece of information above: a Store instance is also being resolved for each field, and if we just map over a User (Const String), that Store instance won't be visible, so the compiler will complain. What we really need is something like User (Store :*: Const String), but since Store is a class rather than a type constructor, that's not even a valid type. But it almost is! And there's another package to help us make it happen: constraints, whose Dict reifies a class constraint as an ordinary value:

storageMeta :: User ((Dict :.: Store) :*: Const String)
storageMeta = User {
    weight = Comp1 Dict :*: Const "weight"
  , height = Comp1 Dict :*: Const "height"
  , firstName = Comp1 Dict :*: Const "firstName"
  , lastName = Comp1 Dict :*: Const "lastName"
  , birthday = Comp1 Dict :*: Const "birthday"
  }

Now that's actually quite minimal, and in fact it looks like something that could trivially be automated with Template Haskell, especially if you split it up into a User (Dict :.: Store) and a User (Const String) and simply combined them using the rank 2 version of (,) <$> x <*> y (a rough sketch of this appears after the next code block). I have not done this, but if I start using this approach more in the future it might be worth pursuing, as it would completely remove the copy-paste boilerplate. Anyway, given this storageMeta :: User ((Dict :.: Store) :*: Const String), how do we get back to where we want to be with User RBasic? A simple rank 2 fmap will suffice:

userStorage :: User RBasic
userStorage = Rank2.fmap f storageMeta
  where
    f :: ((Dict :.: Store) :*: Const String) a -> RBasic a
    f ((Comp1 Dict) :*: Const name) =
      RBasic . declareBasic . fromString $ "user-" ++ name
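
As an aside, here is a rough sketch of the split-and-combine idea mentioned above. The storageDicts and storageNames values are hypothetical (the former is the part Template Haskell could in principle generate), and the combination reuses the same Arrow/(:*:) pattern from productExample:

storageDicts :: User (Dict :.: Store)
storageDicts = User {
    weight = Comp1 Dict
  , height = Comp1 Dict
  , firstName = Comp1 Dict
  , lastName = Comp1 Dict
  , birthday = Comp1 Dict
  }

storageNames :: User (Const String)
storageNames = User {
    weight = Const "weight"
  , height = Const "height"
  , firstName = Const "firstName"
  , lastName = Const "lastName"
  , birthday = Const "birthday"
  }

storageMeta' :: User ((Dict :.: Store) :*: Const String)
storageMeta' =
  (Rank2.Arrow . (:*:)) Rank2.<$> storageDicts Rank2.<*> storageNames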

Ok! So now we just need to construct our public API. First, let's do data retrieval. Despite the fact that we're storing each of these fields separately in Redis, we can still query them all at once in a single command (see my previous article for how this is done), because ~~> is Applicative:

userQuery :: UserId ~~> User Maybe
userQuery = Rank2.traverse f userStorage
  where
    f :: RBasic a -> UserId ~~> Maybe a
    f (RBasic x) = liftq x

Almost elegant! In fact, I would export this in the public API too, since this query can be combined further with whatever others you may wish to make later; but in the interest of exporting a completely trivial API that anyone can use, we'll go ahead and reify to Redis after using the profunctor traversal:

getUsers :: [ UserId ] -> Redis [ User Maybe ]
getUsers = mget $ traverse' userQuery

For setUser :: UserId -> User Maybe -> Redis (), I actually don't have any fancy story for mset yet, although I hope to someday, as there's no reason this shouldn't be possible. Anyway, for now, we just need to do a traversal in the Redis monad and set each field:

setUser :: UserId -> User Maybe -> Redis ()
setUser i u =
  const () <$>
  Rank2.traverse f
    ((Rank2.Arrow . (:*:)) Rank2.<$> u Rank2.<*> userStorage)
  where
    f :: (Maybe :*: RBasic) a -> Redis (Const () a)
    f (Nothing :*: _) = pure $ Const ()
    f (Just x :*: RBasic s) = do
      set' s i x
      pure $ Const ()

You can see from the pattern match that the Nothing case performs no action, so we do get the desired behavior of leaving empty fields untouched.

Anyway, that gives us our simple public API with the fancy behavior as described! As I said at the start, I do quite like this approach conceptually, and it has one other benefit that I didn't mention: because paths are declared at field-level granularity, this approach is resilient to changes like adding or removing fields from the overall structure. In contrast, with the traditional naive approach of serializing the whole record under a single key, adding or removing a field would make old data in the database incompatible with the new type, requiring a data migration. So conceptually it's very clean and has a lot of advantages; the downside is that hammering out this rank 2 code is fairly painful compared to working with the same typeclasses in rank 1 polymorphic space.

A repository containing compilable code for this article can be found here.