Let's Talk About Functors In Rescript

Introduction

For basic syntax, functional programming does not feel that different from other paradigms. Sure, data and behaviour are separate, so you don’t have classes or objects or inheritance, but it feels relatively the same. This is especially true in Rescript, with the pipe-first syntax almost looking like a method lookup (comparable to the self object in Python).

But if you dig deeper into the study of functional languages, you start encountering bizarre words such as “monad” and “functor”. Again, this is less true in Rescript, which emphasizes pragmatism over purity. Indeed, if you search the Rescript documentation for “monad”, it will come up blank, and if you search for “functor” you get a fairly short section that almost feels like an afterthought.

Functors are not a core feature of Rescript, but they are a useful abstraction that I wanted to understand well enough to explain them clearly.

Patreon

This article is part of a series on Rescript programming, though it is another stand-alone article with no hard dependencies on my earlier tutorials.

This series takes a lot of time and dedication to write and maintain. The main thing that has kept me invested in writing this series is support from my Patrons.

Other ways to show support include sharing the articles on social media, commenting on them here, or a quick thank you on the Rescript forum.

But first, let’s think about generic types

According to the Rescript docs, a functor is “a function that accepts a module and returns a module”. When I first read this, I assumed it was somewhat analogous to a class decorator in Python, which usually accepts a class and returns a slightly modified version of that class.

But as I dug deeper, I discovered that a functor is more about types, and the fact that it’s operating on modules is kind of an afterthought.

So in order to reason about functors, let us first reason about types. Rescript already has a concept of generic types. We use them, for example, in generic collections (e.g. array<string> vs array<int>), and you can define your own generic records as follows:

type keyValuePair<'a, 'b> = {
  name: 'a,
  value: 'b,
}

This type is generic across two types, labelled 'a and 'b. You can generate an infinite number of specific types from this generic type. For example, you could instantiate a version of this type that accepts a string and an integer:

let hundredsOfApples: keyValuePair<string, int> = {
  name: "apple",
  value: 480,
}

You can also create a specific type from a generic type:

type twoStrings = keyValuePair<string, string>

And you can even create a new generic type from an existing generic type. This one only has one type parameter:

type stringValuePair<'a> = keyValuePair<string, 'a>

Note that I intentionally abused 'a above for pedagogical purposes. This 'a is completely different from the 'a in type keyValuePair<'a, 'b>. In fact, this 'a is being assigned to the 'b in keyValuePair! In both cases, 'a just means “some type that I don’t care what it is right now because I am generic”.

Incidentally, hundredsOfApples is a valid stringValuePair<int>.
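To make that last claim concrete, here is a small sketch that reuses the definitions above. The names sameApples and wrongApples are made up for this illustration.

```rescript
type keyValuePair<'a, 'b> = {
  name: 'a,
  value: 'b,
}

type stringValuePair<'a> = keyValuePair<string, 'a>

let hundredsOfApples: keyValuePair<string, int> = {
  name: "apple",
  value: 480,
}

// Accepted: stringValuePair<int> is another name for keyValuePair<string, int>
let sameApples: stringValuePair<int> = hundredsOfApples

// Rejected at compile time if uncommented: the value would have to be a string
// let wrongApples: stringValuePair<string> = hundredsOfApples
```

The alias does not create a new type; it only fixes one of the two parameters, so the compiler treats both annotations as the same type.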

Adding functions to the mix

You can pass a generic type into a function and have it return a generic type as the response. For example, this is a valid Rescript function:

let extractValue: keyValuePair<'a, 'b> => 'b = pair => pair.value

So is this:

let valueIfName: (keyValuePair<string, 'b>, string) => option<'b> = (pair, nameToMatch) => {
  if pair.name == nameToMatch {
    Some(pair.value)
  } else {
    None
  }
}

In the above example, note that the return value is a new generic type (an option, which is built into Rescript), and its value must be the same type as the value of the pair that is passed in: 'b.
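As a rough sketch of how 'b flows from the argument to the result, here is valueIfName called with two different value types. The records apples and sky and the result names are invented for this example.

```rescript
type keyValuePair<'a, 'b> = {
  name: 'a,
  value: 'b,
}

let valueIfName: (keyValuePair<string, 'b>, string) => option<'b> = (pair, nameToMatch) => {
  if pair.name == nameToMatch {
    Some(pair.value)
  } else {
    None
  }
}

let apples = {name: "apple", value: 480}
let sky = {name: "sky", value: "blue"}

// 'b is int here, so the result is an option<int>
let appleCount: option<int> = valueIfName(apples, "apple")
let noMatch: option<int> = valueIfName(apples, "pear")

// 'b is string here, so the result is an option<string>
let skyColour: option<string> = valueIfName(sky, "sky")
```

Annotating appleCount as option<string> instead would be a compile-time error, because the pair that was passed in holds an int.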

You can make a generic function in Rescript where the argument itself is generic like this:

let showAndReturn: 'a => 'a = argument => {
  Js.log(argument)
  argument
}

This function accepts an argument of any type. However, there is no call-site syntax for specializing it to one particular type. For example, this is valid Typescript (note the last line):

// Typescript, not Rescript
function showAndReturn<atype>(argument: atype): atype {
    console.log(argument)
    return argument
}

let m = showAndReturn<string>("hello");

The advantage of the Typescript function over the Rescript one is that you can specialize the type when you call it, such that the following is a compile-time error:

// Still Typescript, won't compile
let m = showAndReturn<number>("hello");

There is no direct analogue in Rescript that I know of. However, functors (we’ll get to them eventually, I promise) are one way to provide that generic typing.
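A rough approximation, and only a sketch rather than a true equivalent of the Typescript call-site syntax, is to bind the generic function to a new name with a narrower type annotation. The name showString is made up for this example.

```rescript
let showAndReturn: 'a => 'a = argument => {
  Js.log(argument)
  argument
}

// A string-only view of the same function, pinned by the annotation
let showString: string => string = showAndReturn

let m = showString("hello")

// Rejected at compile time if uncommented: an int is not a string
// let n = showString(42)

// The generic original is untouched and still accepts any type
let stillGeneric = showAndReturn(42)
```

This needs one extra binding per specialization, which is part of why it does not feel like a real substitute.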

Let’s think about a database

When we first started this discussion, we noted that a functor is “a function that accepts a module and returns a module.” But I don’t find that a very useful description. Instead, I think of a functor as a “generically typed module.” This is a pretty limited definition compared to the entirety of things a functor can do, but it is probably the definition that is used most often.

Let’s set the stage by using the same example I used in my previous article on polymorphic variants: a database-like store that only allows certain types to be inserted into certain tables.

In our example, we’ll have three separate types for storing friends, dogs, and books. Each of these has a different structure:

type friend = {
  name: string,
  age: int,
}

type dog = {
  name: string,
  colour: string,
}

type book = {
  title: string,
  author: string,
}

Each of these will go in the database in separate tables. What I’m looking for is a function that would be called like this:

add("dogs", {name: "lassie", colour: "red and white"})

add("friends", {name: "ed", age: 14})

but not (with a compile-time error)

add("books", {name: "ed", age: 14})

One less-than-desirable way we could implement this is by having a module that exposes addDog, addFriend, and addBook but not the general add function, as follows:

module type DatabaseInterface = {
  let addDog: dog => unit
  let addFriend: friend => unit
  let addBook: book => unit
}

module Database: DatabaseInterface = {
  @module("database") external add: (string, 'a) => unit = "add"

  let addDog = dog => add("dogs", dog)
  let addFriend = friend => add("friends", friend)
  let addBook = book => add("books", book)
}

Database.addDog({name: "lassie", colour: "red and white"})

The add external binds to some mythical database function that accepts any string and any object. However, that function is not visible “outside” the Database module because add is not part of the DatabaseInterface type. This means you can’t accidentally or maliciously call Database.add("books", {name: "ed", age: 14}) to add a person to the books database. (Something that is all too easy in Javascript!)

This is all great; I can start using this interface and everything will be strongly typed. However, there are a few problems:

Every time I want to add a new function to the database module (e.g. put or get or bulkAdd…) I have to add a separate call for each table in the database.
Similarly, every time I want to add a new table to the database model, I have to add a separate call for each of those functions in the Database module.
It’s not possible to publish generic bindings that work with tables the bindings author has never heard of. This whole project started when I was trying to model the exceptional Dexie database in Rescript. If I publish Dexie bindings for your consumption, they won’t be much good to you if the only tables they can handle are friends, dogs, and books!

So what we really want to do is generate a separate copy of the Database module for each of the types that you might want to model. And that is where functors come in (finally, I know).

Functors

If we look at our current DatabaseInterface type, we can see two pieces that vary: the name of the table and the type of the value that can go into that table. Let’s start our exploration of functors by creating a SchemaItem module type that models both:

type schemaId = string

module type SchemaItem = {
  type t
  let tableName: schemaId
}

I added the schemaId alias to signal what the string is for, but it’s just a string, so the compiler will still accept any other string in its place.

We can now create three separate instances of this type for our three tables:

module FriendSchema = {
  type t = friend
  let tableName = "friends"
}

module DogSchema = {
  type t = dog
  let tableName = "dogs"
}

module BookSchema = {
  type t = book
  let tableName = "books"
}

Note that you can make a separate set of SchemaItem modules for your own database; perhaps you are tracking aircraft or TV shows instead (though this clearly suggests you don’t have enough friends, dogs, or books in your life).

Tip: You might be tempted to type these modules like this:

// This won't work
module FriendSchema: SchemaItem = {
  type t = friend
  let tableName = "friends"
}

However, this over-specifies the type: the annotation makes t abstract, so outside the module FriendSchema.t is only known as “some SchemaItem.t” and is no longer known to be a friend. I got hung up on this for a while before I finally got my functor to work.
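If you do want an annotation on the module, one option I believe works is a "with type" constraint inherited from OCaml. The Rescript module docs do not cover it, so treat this as a sketch under that assumption rather than the recommended approach. The binding ed is only there to show that the type stays visible.

```rescript
type friend = {
  name: string,
  age: int,
}

module type SchemaItem = {
  type t
  let tableName: string
}

// The constraint keeps `t = friend` visible through the annotation
module FriendSchema: SchemaItem with type t = friend = {
  type t = friend
  let tableName = "friends"
}

// Type-checks only because FriendSchema.t is still known to be friend
let ed: FriendSchema.t = {name: "ed", age: 14}
```

Leaving the annotation off entirely, as in the three modules above, gives the same result with less ceremony.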

Next up, we need a way to generate a version of the original Database module given one of these schemas. Effectively, we need a generic Database module that has some specific instance of the SchemaItem module as its generic type.

In completely different words, in fact, what we need is… you guessed it, “a function that accepts a module and returns a module”.

The syntax for a functor is kind of like a mashup of the syntax for defining a module and the syntax for defining a function. The functor “signature” looks much like a function definition with parameters and a fat arrow, except that the name must be capitalized and you use the module keyword instead of let to do the assigning:

module MakeSchema = (Schema: SchemaItem) => {}

In this case, the “argument” is another module, which must be an instance of the SchemaItem type, but is otherwise completely generic. You cannot pass arguments into a functor that are not modules, but you can pass multiple modules into one.
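To illustrate the multiple-module case, here is a minimal sketch of a functor that takes two SchemaItem modules. MakePairSchema, FriendDogSchema, and walk are hypothetical names for this example and are not part of the database we are building.

```rescript
type friend = {name: string, age: int}
type dog = {name: string, colour: string}

module type SchemaItem = {
  type t
  let tableName: string
}

module FriendSchema = {
  type t = friend
  let tableName = "friends"
}

module DogSchema = {
  type t = dog
  let tableName = "dogs"
}

// Hypothetical functor for illustration: it accepts two modules
module MakePairSchema = (Left: SchemaItem, Right: SchemaItem) => {
  type t = (Left.t, Right.t)
  let tableName = Left.tableName ++ "_" ++ Right.tableName
}

module FriendDogSchema = MakePairSchema(FriendSchema, DogSchema)

// FriendDogSchema.t is (friend, dog)
let walk: FriendDogSchema.t = (
  {name: "ed", age: 14},
  {name: "lassie", colour: "red and white"},
)
```

The arguments are separated by commas just like function parameters, both in the definition and where the functor is applied.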

The contents of the functor body (the {}) are more like a module definition than a function. Instead of the last executed expression being the “return value” of the functor, the entire body is the returned module. It contains type and let definitions for an entire new module that is (effectively) “generated” on the fly. Here’s one that specifies a couple of simple functions:

module MakeSchema = (Schema: SchemaItem) => {
  let add: Schema.t => unit = item => Database.add(Schema.tableName, item)
  let get: int => option<Schema.t> = id => Database.get(id)
}
This presupposes the existence of a Database module that has bindings something like this:

module Database = {
  @module("database") external add: (string, 'a) => unit = "add"
  @module("database") external get: int => 'a = "get"
}

Note: the astute reader may be thinking “But that Database module would allow me to put the wrong type in any table” (either through malice or if your brain overheated). You’re right, astute reader. We’ll fix it in a bit.

The MakeSchema functor “returns” a module that has two functions, add and get, defined on it. We can now “call” this functor with our three SchemaItem instances to generate three different modules that have those two functions:

module Friends = MakeSchema(FriendSchema)
module Dogs = MakeSchema(DogSchema)
module Books = MakeSchema(BookSchema)

Now we can write something like:

Friends.add({name: "lizzi", age: 8})

with correct type safety, as we’ll get a compile-time error if we try to add a book instead:

// compile error
Friends.add({title: "Pyramids", author: "Terry Pratchett"})

Hiding the implementation

That pesky Database module still exists, meaning we can subvert the whole type system as follows:

Database.add("friends", {title: "Pyramids", author: "Terry Pratchett"})

We need to find a way to hide that little detail. We can start by moving Database into the MakeSchema functor:

module MakeSchema = (Schema: SchemaItem) => {
  module Database = {
    @module("database") external add: (string, 'a) => unit = "add"
    @module("database") external get: int => 'a = "get"
  }
  let add = (item: Schema.t) => Database.add(Schema.tableName, item)
  let get = id => Database.get(id)
}

However, this just moves the problem down a level; we can still do this, which seems even more confusing:

Friends.Database.add("friends", {title: "Pyramids", author: "Terry Pratchett"})

The solution is to add a module type to the functor, just as we did for the DatabaseInterface earlier:

module type MakeSchemaType = (Schema: SchemaItem) => {
  let add: Schema.t => unit
  let get: int => option<Schema.t>
}

module MakeSchema: MakeSchemaType = (Schema: SchemaItem) => {
  module Database = {...}
  ...
}

Now Friends.Database is not accessible and there is no way to accidentally add a book.
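Since the external bindings point at a mythical database, here is a self-contained sketch of the same hiding technique that you can actually compile. It assumes an in-memory array (the rows binding, which I made up) in place of the external add and get, and it treats the id as the insertion index.

```rescript
type friend = {name: string, age: int}

module type SchemaItem = {
  type t
  let tableName: string
}

module type MakeSchemaType = (Schema: SchemaItem) => {
  let add: Schema.t => unit
  let get: int => option<Schema.t>
}

module MakeSchema: MakeSchemaType = (Schema: SchemaItem) => {
  // Assumption: an in-memory array stands in for the external database
  let rows: ref<array<Schema.t>> = ref([])
  let add = (item: Schema.t) => {
    rows := Belt.Array.concat(rows.contents, [item])
  }
  // In this sketch the id is simply the insertion index
  let get = id => Belt.Array.get(rows.contents, id)
}

module FriendSchema = {
  type t = friend
  let tableName = "friends"
}

module Friends = MakeSchema(FriendSchema)

Friends.add({name: "lizzi", age: 8})
let firstFriend = Friends.get(0)
let missing = Friends.get(1)

// Rejected at compile time if uncommented: rows is not in MakeSchemaType
// let leaked = Friends.rows
```

Only the names listed in MakeSchemaType survive the annotation, so the storage detail is hidden in the same way Friends.Database is.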

The input modules

I have been thinking of the module that is passed into the functor as being little more than a collection of related types and maybe a few constant values as with the tableName above.

But there are actually no constraints on that input module. It can have all the features of any other module, including functions that can be called and nested modules. A great example is the one given in the Rescript documentation:

module type Comparable = {
  type t
  let equal: (t, t) => bool
}

module MakeSet = (Item: Comparable) => {
  // let's use a list as our naive backing data structure
  type backingType = list<Item.t>
  let empty = list{}
  let add = (currentSet: backingType, newItem: Item.t): backingType =>
    // if item exists
    if List.exists(x => Item.equal(x, newItem), currentSet) {
      currentSet // return the same (immutable) set (a list really)
    } else {
      list{
        newItem,
        ...currentSet // prepend to the set and return it
      }
    }
}

In this example, the module passed in as Item (which must satisfy Comparable) provides the equal function that is called inside the MakeSet functor to check for containment.
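Here is a sketch of what calling that functor might look like. FriendByName and FriendSet are names I made up, and I have swapped List.exists for Belt.List.some so the block does not depend on which standard library your Rescript version ships; otherwise MakeSet is the same as above.

```rescript
type friend = {name: string, age: int}

module type Comparable = {
  type t
  let equal: (t, t) => bool
}

module MakeSet = (Item: Comparable) => {
  type backingType = list<Item.t>
  let empty = list{}
  let add = (currentSet: backingType, newItem: Item.t): backingType =>
    // Belt.List.some plays the role of List.exists here
    if Belt.List.some(currentSet, x => Item.equal(x, newItem)) {
      currentSet
    } else {
      list{newItem, ...currentSet}
    }
}

// Hypothetical input module: two friends are equal when their names match
module FriendByName = {
  type t = friend
  let equal = (a: friend, b: friend) => a.name == b.name
}

module FriendSet = MakeSet(FriendByName)

// The second "ed" has the same name as the first, so it is not added
let friends =
  FriendSet.empty
  ->FriendSet.add({name: "ed", age: 14})
  ->FriendSet.add({name: "ed", age: 15})
  ->FriendSet.add({name: "lizzi", age: 8})

let size = Belt.List.length(friends)
```

The set itself knows nothing about friends; the notion of equality comes entirely from the module that was passed in.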

In Closing

Hopefully you understand functors a little better now (I know I do). They are a very powerful language feature, and like all powerful language features, they can be overused and abused. I think they are more useful when writing libraries or frameworks than in regular application development. Typically, Rescript is all about the Javascript bindings, though this will likely change as more and more pure-Rescript libraries are added to the ecosystem. If you are writing such a library, you’ll likely need functors at some point. If not, hopefully they are a useful tool to have in your back pocket for that time when generic types just aren’t quite enough.