Let's Talk About Functors In Rescript
Introduction
For basic syntax, functional programming does not feel that different from
other paradigms. Sure, data and behaviour are separate, so you don’t have
classes or objects or inheritance, but it feels relatively the same. This is
especially true in Rescript, where the pipe-first syntax looks almost like a
method call (the piped value plays a role comparable to the self object in Python).
But if you dig deeper into functional languages, you start encountering bizarre words such as “monad” and “functor”. Again, this is less true in Rescript, which emphasizes pragmatism over purity. Indeed, if you search the Rescript documentation for “monad”, it will come up blank, and if you search for “functor” you get a fairly short section that almost feels like an afterthought.
Functors are not a core feature of Rescript, but they are a useful abstraction that I wanted to understand well enough to explain them clearly.
Patreon
This article is part of a series on Rescript programming, though it is another stand-alone article with no hard dependencies on my earlier tutorials.
This series takes a lot of time and dedication to write and maintain. The main thing that has kept me invested in writing this series is support from my Patrons.
Other ways to show support include sharing the articles on social media, commenting on them here, or a quick thank you on the Rescript forum.
Other articles in series
With over a dozen articles and counting, I’ve created a table of contents listing all the articles in this series in reading order.
But first, let’s think about generic types
Reading the Rescript docs, a Functor is “a function that accepts a module and returns a module”. When I first read this, I assumed it was somewhat analogous to a class decorator in Python, which usually accepts a class and returns a slightly modified version of that class.
But as I dug deeper, I discovered that a functor is more about types, and the fact that it’s operating on modules is kind of an afterthought.
So in order to reason about functors, let us first reason about types. Rescript
already has a concept of generic types. We use them, for example, in generic
collections (e.g. array<string> vs array<int>), and you can define your
own generic records as follows:
type keyValuePair<'a, 'b> = {
name: 'a,
value: 'b,
}
This type is generic across two types, labelled 'a and 'b. You can generate
an infinite number of specific types from this generic type. For example, you
could instantiate a version of this type that accepts a string and an integer:
let hundredsOfApples: keyValuePair<string, int> = {
name: "apple",
value: 480,
}
You can also create a specific type from a generic type:
type twoStrings = keyValuePair<string, string>
And you can even create a new generic type from an existing generic type. This one only has one type parameter:
type stringValuePair<'a> = keyValuePair<string, 'a>
Note that I intentionally abused 'a above for pedagogical purposes. This 'a
is completely different from the 'a in type keyValuePair<'a, 'b>. In fact,
this 'a is being assigned to the 'b in keyValuePair! In both cases, 'a
just means “some type that I don’t care about right now because I am generic”.
Incidentally, hundredsOfApples is a valid stringValuePair<int>.
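As a quick check of that last claim, here is a minimal sketch that repeats the type definitions from above and assigns hundredsOfApples to a binding annotated with the narrower alias. The name sameApples is made up for this illustration.

```rescript
type keyValuePair<'a, 'b> = {
  name: 'a,
  value: 'b,
}

// One type parameter left open; the name is fixed to string
type stringValuePair<'a> = keyValuePair<string, 'a>

let hundredsOfApples: keyValuePair<string, int> = {
  name: "apple",
  value: 480,
}

// Hypothetical binding: the annotation is accepted because
// stringValuePair<int> is just another name for keyValuePair<string, int>
let sameApples: stringValuePair<int> = hundredsOfApples
```

Both annotations describe the same underlying record type, so no conversion is involved.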
Adding functions to the mix
You can pass a generic type into a function and have it return a generic type as the response. For example, this is a valid Rescript function:
let extractValue: keyValuePair<'a, 'b> => 'b = pair => pair.value
so is this:
let valueIfName: (keyValuePair<string, 'b>, string) => option<'b> = (pair, nameToMatch) => {
if pair.name == nameToMatch {
Some(pair.value)
} else {
None
}
}
In the above example, note that the return value is a new generic type (an
option, which is built in to Rescript), and its value must be the same type
as the value of the pair that is passed in, defined as 'b.
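To make the flow of 'b concrete, here is a small usage sketch. It repeats the definitions above so it stands alone. The bindings apples, found, and missing are names invented for this example.

```rescript
type keyValuePair<'a, 'b> = {
  name: 'a,
  value: 'b,
}

let valueIfName: (keyValuePair<string, 'b>, string) => option<'b> = (pair, nameToMatch) => {
  if pair.name == nameToMatch {
    Some(pair.value)
  } else {
    None
  }
}

// Hypothetical data: 'b is inferred as int from the value field
let apples = {name: "apple", value: 480}

let found = valueIfName(apples, "apple") // Some(480), an option<int>
let missing = valueIfName(apples, "pear") // None
```

Because the pair holds an int, both results are typed option<int> without any annotation at the call site.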
You can make a generic function in Rescript where the argument itself is generic like this:
let showAndReturn: 'a => 'a = argument => {
Js.log(argument)
argument
}
This function accepts an argument of any type. However, there is no way to specialize it at the call site. For example, this is valid Typescript (note the last line):
// Typescript, not Rescript
function showAndReturn<atype>(argument: atype): atype {
console.log(argument)
return argument
}
let m = showAndReturn<string>("hello");
The advantage of the Typescript function over the Rescript one is that you can specialize the type when you call it, such that the following is a compile-time error:
// Still typescript, won't compile
let m = showAndReturn<number>("hello");
There is no direct analogue in Rescript that I know of. However, functors (we’ll get to them eventually, I promise) are one way to provide that generic typing.
Let’s think about a database
When we first started this discussion, we noted that a functor is “a function that accepts a module and returns a module.” But I don’t find that a very useful description. Instead, I think of a functor as a “generically typed module.” This is a pretty limited definition compared to the entirety of things a functor can do, but it is probably the definition that is used most often.
Let’s set the stage by using the same example I used in my previous article on poly variants: a database-like store that only allows certain types to be inserted into certain tables.
In our example, we’ll have three separate types for storing friends, dogs,
and books. Each of these has a different structure:
type friend = {
name: string,
age: int,
}
type dog = {
name: string,
colour: string,
}
type book = {
title: string,
author: string,
}
Each of these will go in the database in separate tables. What I’m looking for is a function that would be called like this:
add("dogs", {name: "lassie", colour: "red and white"})
or
add("friends", {name: "ed", age: 14})
but not (with a compile-time error)
add("books", {name: "ed", age: 14})
One less-than-desirable way we could implement this is by having a module that
exposes addDog, addFriend, and addBook but not the general add
function, as follows:
module type DatabaseInterface = {
let addDog: dog => unit
let addFriend: friend => unit
let addBook: book => unit
}
module Database: DatabaseInterface = {
@module("database") external add: (string, 'a) => unit = "add"
let addDog = dog => add("dogs", dog)
let addFriend = friend => add("friends", friend)
let addBook = book => add("books", book)
}
Database.addDog({name: "lassie", colour: "red and white"})
The add external binds to some mythical database function that accepts any
string and any object. However, that function is not visible “outside” the
Database module because add is not part of the DatabaseInterface type.
This means you can’t accidentally or maliciously call Database.add("books", {name: "ed", age: 14})
to add a person to the books database. (Something that
is all too easy in Javascript!)
This is all great; I can start using this interface and everything will be strongly typed. However, there are a few problems:
- Every time I want to add a new function to the database module (e.g. put or get or bulkAdd…) I have to add a separate call for each table in the database.
- Similarly, every time I want to add a new table to the database model, I have to add a separate call for each of those functions in the Database module.
- It’s not possible to create generic bindings to a specific table. This whole project started when I was trying to model the exceptional Dexie database in Rescript. If I publish Dexie bindings for your consumption, it won’t be much good to you if the only tables it can handle are friend, dog, and books!
So what we really want to do is generate separate copies of the
Database module for each of the types that you might want to model, and that
is where functors come in (finally, I know).
Functors
If we look at our current DatabaseInterface type, we can see two pieces that
vary: the name of the table and the type of the value that can go into that
table. Let’s start our exploration of functors by creating a SchemaItem
module type that models both:
type schemaId = string
module type SchemaItem = {
type t
let tableName: schemaId
}
I added the schemaId type alias so you are less likely to pass in the wrong kind
of string, but it’s just a string.
We can now create three separate instances of this type for our three tables:
module FriendSchema = {
type t = friend
let tableName = "friends"
}
module DogSchema = {
type t = dog
let tableName = "dogs"
}
module BookSchema = {
type t = book
let tableName = "books"
}
Note that you can make a separate set of SchemaItem
modules for your own
database; perhaps you are tracking aircraft or tv shows instead (though this
clearly suggests you don’t have enough friends, dogs, or books in your life).
Tip: You might be tempted to type these modules like this:
// This won't work
module FriendSchema: SchemaItem = {
type t = friend
let tableName = "friends"
}
However, this over-specifies the type: the annotation makes FriendSchema.t abstract,
so outside the module it is only the opaque t declared in SchemaItem, and the compiler
no longer knows it is a friend at all. I got hung up on this for a while before
I finally got my functor to work.
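If you do want the annotation, one option is a with type constraint, which keeps the signature check but tells the compiler what t is. This is a minimal sketch rather than something the rest of this article relies on. It uses a plain string for tableName instead of schemaId to stay short, and the binding ed is invented for the example.

```rescript
type friend = {
  name: string,
  age: int,
}

module type SchemaItem = {
  type t
  let tableName: string
}

// "with type t = friend" checks the module against SchemaItem
// while leaving t visible as friend instead of abstract
module FriendSchema: SchemaItem with type t = friend = {
  type t = friend
  let tableName = "friends"
}

// Hypothetical value: accepted because FriendSchema.t is known to be friend
let ed: FriendSchema.t = {name: "ed", age: 14}
```

Leaving the annotation off entirely, as in the three modules above, achieves the same visibility with less typing.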
Next up, we need a way to generate a version of the original Database module
given one of these schemas. Effectively, we need a generic Database module
that has some specific instance of the SchemaItem module as its generic
type.
In completely different words, in fact, what we need is… you guessed it, “a function that accepts a module and returns a module”.
The syntax for a functor is kind of like a mashup of the syntax for defining a
module and the syntax for defining a function. The functor “signature” looks
exactly like a function definition with parameters and a fat arrow, except that the
name must be capitalized and you use the module keyword instead of let
to do the assigning:
module MakeSchema = (Schema: SchemaItem) => {}
In this case, the “argument” is another module, which must be an instance of
the SchemaItem
type, but is otherwise completely generic. You cannot pass
arguments into a functor that are not modules, but you can pass multiple
modules into one.
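As a sketch of the multiple-module case, assuming the comma-separated parameter form, here is a functor that takes two schemas. MakeLink, FriendDogs, and the simplified string-typed schemas are all hypothetical names for this illustration, not part of the database example.

```rescript
module type SchemaItem = {
  type t
  let tableName: string
}

// Hypothetical functor with two module parameters
module MakeLink = (Left: SchemaItem, Right: SchemaItem) => {
  type t = (Left.t, Right.t)
  let tableName = Left.tableName ++ "_" ++ Right.tableName
  let make = (left: Left.t, right: Right.t): t => (left, right)
}

// Simplified schemas that use string for t to keep the sketch short
module FriendSchema = {
  type t = string
  let tableName = "friends"
}
module DogSchema = {
  type t = string
  let tableName = "dogs"
}

// "Calling" the functor with two modules
module FriendDogs = MakeLink(FriendSchema, DogSchema)
let link = FriendDogs.make("ed", "lassie")
```

Each parameter is checked against its own module type, so the two arguments do not have to share a signature.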
The contents of the functor body (the {}) are more like a module definition
than a function. Instead of the last executed expression being the “return
value” of the functor, the entire body is the returned module. It contains type
and let definitions for an entire new module that is (effectively) “generated”
on the fly. Here’s one that specifies a couple of simple functions:
module MakeSchema = (Schema: SchemaItem) => {
let add: Schema.t => unit = item => Database.add(Schema.tableName, item)
let get: int => option<Schema.t> = id => Database.get(id)
}
This presupposes the existence of a Database module that has bindings
something like this:
module Database = {
@module("database") external add: (string, 'a) => unit = "add"
@module("database") external get: int => 'a = "get"
}
Note: the astute reader may be thinking “But that Database module would allow me to put the wrong type in any table” (either through malice or if your brain overheated). You’re right, astute reader. We’ll fix it in a bit.
The MakeSchema functor “returns” a module that has two functions, add and
get, defined on it. We can now “call” this functor with our three SchemaItem
instances to generate three different modules that have those two functions:
module Friends = MakeSchema(FriendSchema)
module Dogs = MakeSchema(DogSchema)
module Books = MakeSchema(BookSchema)
Now we can write something like:
Friends.add({name: "lizzi", age: 8})
with correct type safety, as we’ll get a compile-time error if we try to add a book instead:
// compile error
Friends.add({title: "Pyramids", author: "Terry Pratchett"})
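If you want to try this without a real database, here is a self-contained sketch that swaps the external bindings for an in-memory array. The rows store and the count helper are assumptions made for this example only, and get treats the id as an array index rather than a real database key.

```rescript
type friend = {
  name: string,
  age: int,
}

module type SchemaItem = {
  type t
  let tableName: string
}

module MakeSchema = (Schema: SchemaItem) => {
  // In-memory stand-in for the external database binding (assumption)
  let rows: ref<array<Schema.t>> = ref([])

  let add = (item: Schema.t) => {
    rows := Belt.Array.concat(rows.contents, [item])
  }

  // In this sketch the id is simply the position in the array
  let get = (id: int): option<Schema.t> => Belt.Array.get(rows.contents, id)

  // Hypothetical helper, not part of the functor in the article
  let count = () => Belt.Array.length(rows.contents)
}

module FriendSchema = {
  type t = friend
  let tableName = "friends"
}

module Friends = MakeSchema(FriendSchema)

Friends.add({name: "lizzi", age: 8})
```

Each application of the functor evaluates its body afresh, so a second module generated from another schema would get its own separate rows store.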
Hiding the implementation
That pesky Database
module still exists, meaning we can subvert the whole
type system as follows:
Database.add("friends", {title: "Pyramids", author: "Terry Pratchett"})
We need to find a way to hide that little detail. We can start by moving
Database
into the MakeSchema
functor:
module MakeSchema = (Schema: SchemaItem) => {
module Database = {
@module("database") external add: (string, 'a) => unit = "add"
@module("database") external get: int => 'a = "get"
}
let add = (item: Schema.t) => Database.add(Schema.tableName, item)
let get = id => Database.get(id)
}
However, this just moves the problem down a level; we can still do this, which seems even more confusing:
Friends.Database.add("friends", {title: "Pyramids", author: "Terry Pratchett"})
The solution is to add a module type to the functor, just as we did for the
DatabaseInterface
earlier:
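module type MakeSchemaType = (Schema: SchemaItem) =>
{
let add: Schema.t => unit
let get: int => option<Schema.t>
}
module MakeSchema: MakeSchemaType = (Schema: SchemaItem) => {
// Database is not listed in MakeSchemaType, so it stays private
module Database = {
@module("database") external add: (string, 'a) => unit = "add"
@module("database") external get: int => 'a = "get"
}
let add = (item: Schema.t) => Database.add(Schema.tableName, item)
let get = id => Database.get(id)
}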
Now Friends.Database
is not accessible and there is no way to accidentally
add a book.
The input modules
I have been thinking of the module that is passed into the functor as being
little more than a collection of related types and maybe a few constant
values as with the tableName
above.
But there are actually no constraints on that input module. It can have all the features of any other module, including functions that can be called and nested modules. A great example is the one given in the Rescript documentation:
module type Comparable = {
type t
let equal: (t, t) => bool
}
module MakeSet = (Item: Comparable) => {
// let's use a list as our naive backing data structure
type backingType = list<Item.t>
let empty = list{}
let add = (currentSet: backingType, newItem: Item.t): backingType =>
// if item exists
if List.exists(x => Item.equal(x, newItem), currentSet) {
currentSet // return the same (immutable) set (a list really)
} else {
list{
newItem,
...currentSet // prepend to the set and return it
}
}
}
In this example, the Comparable
module provides a “generic” equal
function
that is called inside the MakeSet
functor to check for containment.
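Here is a sketch of how that functor might be used with integers. It repeats the definitions so it stands alone, with one substitution of mine: Belt.List.some replaces List.exists so the example does not depend on which List module is in scope. IntItem, IntSet, and numbers are names invented for the example.

```rescript
module type Comparable = {
  type t
  let equal: (t, t) => bool
}

module MakeSet = (Item: Comparable) => {
  type backingType = list<Item.t>
  let empty: backingType = list{}
  let add = (currentSet: backingType, newItem: Item.t): backingType =>
    // Belt.List.some is swapped in for List.exists (my assumption)
    if Belt.List.some(currentSet, x => Item.equal(x, newItem)) {
      currentSet
    } else {
      list{newItem, ...currentSet}
    }
}

// Hypothetical input module supplying the equality function
module IntItem = {
  type t = int
  let equal = (a: int, b: int) => a == b
}

module IntSet = MakeSet(IntItem)

// The second add(1) is ignored because IntItem.equal finds a match
let numbers = IntSet.empty->IntSet.add(1)->IntSet.add(2)->IntSet.add(1)
```

Swapping in a different Comparable module, for example one that compares records by a single field, changes what counts as a duplicate without touching MakeSet.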
In Closing
Hopefully you understand functors a little better now (I know I do). They are a very powerful language feature, and like all powerful language features, they can be overused and abused. I think they are more useful when writing libraries or frameworks than in regular application development. Typically, Rescript is all about the Javascript bindings, though this will likely change as more and more pure-Rescript libraries are added to the ecosystem. If you are writing such a library, you’ll likely need functors at some point. If not, hopefully they are a useful tool to have in your back pocket for that time when generic types just aren’t quite enough.