Parsing JSON in Rescript
Introduction
This is part of an ongoing series about the Rescript programming language. I’ve been building a toy “recipe book” progressive web app. The app is largely functional now, and I’m using explorations of some third party libraries to clean it up a bit. In this edition, we discuss parsing JSON.
Patreon
This series takes a lot of time and dedication to write and maintain. The main thing that has kept me invested in writing this series is support from my Patrons.
Other ways to show support include sharing the articles on social media, commenting on them here, or a quick thank you on the Rescript forum.
Other articles in series
With over a dozen articles and counting, I’ve created a table of contents listing all the articles in this series in reading order.
Getting our bearings
This article will improve the existing json parsing in the express portion of the rescript-express-recipes repository. You can git checkout the zora-express branch if you want to start from the same place I am in this article.
If you want to follow along with the changes in this article, I try to make a separate commit for each section. You can find them in the jzon branch.
Before we go there, though, I want to make an update to make node 16 a little happier.
A few articles back, I changed the package.json to use "type": "module", and made the bsconfig build modules in "module": "es6" format. Node 16 can’t figure out whether an import should happen in commonjs or es6 format without a little help (it used to just attempt it blindly). The easiest way to supply that help is to make all the files have .mjs extensions instead of .bs.js.
So go ahead and change the suffix in bsconfig.json to .mjs.
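As a rough sketch, the relevant part of bsconfig.json would end up looking something like the fragment below. Only the two keys discussed here are shown, and the exact shape of the package-specs entry is an assumption about how the project's file is laid out; everything else in the file stays as it was.

```json
{
  "package-specs": {
    "module": "es6"
  },
  "suffix": ".mjs"
}
```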
Rescript might need a little massaging to pick up the change in the dependencies, though. Try running npx rescript build -with-deps and, if that doesn’t work, do the ever-so-popular dance of blowing away your node modules, running npm install, and then running npx rescript build -with-deps again.
Why JSON is hard
JSON is a beautiful and simple data exchange format, but it has a serious flaw: it’s schemaless by default. When you receive a JSON object from an external source, you have absolutely no guarantees about the content or format of that object. You aren’t even guaranteed that it is valid JSON.
While tools like json-schema can help make communication about the json easier, they don’t solve the fundamental question of “is this valid data”.
When working in an untyped language, you can get away (sort of) with not rigorously checking every field on a json object. You’ll end up with some nasty runtime errors if you don’t do it right, but you can get away with it. You know, in development when you are just prototyping everything and of course you would never push that code to prod, but there is a deadline and really, how bad can it be…
In a strongly typed language, you don’t get that option: you need to take that “unknown” object and convert it into an object with known types, basically one field at a time. If any of the types don’t line up, you need to handle it (often returning 400 bad request, but possibly setting a default or otherwise working around it).
So far in this project, I’ve used the Js.Json API in the rescript standard library to do all my JSON parsing. This API is pretty low-level and requires a lot of boilerplate to do anything useful. You basically have to manually look up every field on the object and check first if it exists and then if it contains valid data in the expected format. This guarantees that when you are done you have the object you expect, but it is hard to write and harder to read.
I won’t say it’s error-prone, however, because the compiler tells me about all the errors in advance!
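To give a feel for that boilerplate, here is a minimal sketch of pulling a single string field out of an untyped value with the standard library. The decodeTitle helper and the sample payloads are hypothetical and not taken from the project; a real payload needs this for every field.

```rescript
// Hypothetical helper (not from the project): extract one string field from
// an untyped Js.Json.t value, checking each step by hand.
let decodeTitle = (json: Js.Json.t): option<string> =>
  json
  ->Js.Json.decodeObject // None unless the value is a JSON object
  ->Belt.Option.flatMap(dict => dict->Js.Dict.get("title")) // None if the key is absent
  ->Belt.Option.flatMap(Js.Json.decodeString) // None unless the value is a string

let ok = decodeTitle(Js.Json.parseExn(`{"title": "Bread"}`)) // Some("Bread")
let wrongType = decodeTitle(Js.Json.parseExn(`{"title": 42}`)) // None
```

Note that a None here cannot say which of the three checks failed, which is part of why the error messages in the original controller were so generic.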
A better parser?
There are quite a few JSON parsers in the older bucklescript and Reason ecosystems. Most don’t appear to be maintained anymore, and they tend to use outdated apis that don’t really fit with modern rescript development.
One newer project is jzon. It specifically targets Rescript syntax and has ultra-low boilerplate. My only concern with the library is that it is quite new and there’s no indication as to whether it will be supported in the long run (it’s already been six months since its last release).
Then again, nothing else appears to be either!
Installing the dependency
As with all rescript dependencies, installing it is a two-step process:
npm install --save rescript-jzon
and then update the bs-dependencies in your bsconfig.json to include "rescript-jzon".
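For illustration, the bs-dependencies entry would then contain the new package name. The fragment below shows only that one entry; whatever dependencies are already listed in the project's bsconfig.json stay alongside it.

```json
{
  "bs-dependencies": ["rescript-jzon"]
}
```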
Our first codec
At first, I was hoping I could just map the json schema we are already using to the types in Store.res. Unfortunately, most of our payloads are actually not exactly those types. So instead, I’ll define payloads for the individual functions in Controller.res. (Don’t forget to run npm test so that zora can let you know when it’s working again!)
Let’s start with the request payload type for the addRecipe body. It is supposed to be a json dict with three string fields: title, ingredients, and instructions:
type addRecipeInput = {
  title: string,
  ingredients: string,
  instructions: string,
}
(I put this in Controller.res just before the addRecipe endpoint.)
This is just a standard Rescript record; nothing too exciting. The next step is to configure Jzon to work with this type. We basically need to provide Jzon with a couple of functions to convert objects to an intermediate format and back again, and a schema for the three fields:
let addRecipeInputCodec = Jzon.object3(
  ({title, ingredients, instructions}) => (title, ingredients, instructions),
  ((title, ingredients, instructions)) =>
    {
      title: title,
      ingredients: ingredients,
      instructions: instructions,
    }->Ok,
  Jzon.field("title", Jzon.string),
  Jzon.field("ingredients", Jzon.string),
  Jzon.field("instructions", Jzon.string),
)
Let’s break that down a bit. First, the variable name addRecipeInputCodec is telling us this is the way to encode or decode an addRecipeInput type to json.
The Jzon.object3 function call says “we want to create a json object with three fields”. Rescript doesn’t have language support for variadic arguments (other than binding to Javascript functions that use them), so you occasionally see numbered function names like this. They are fairly common in Rescript-React for things like the dependency array in a useEffect hook, for example.
The first argument is a simple function to convert an object to an intermediary tuple format that jzon knows how to work with.
The second argument does the opposite; it’s a function to convert the intermediary format to an object. This one returns a result, since the conversion may not go well for more complicated functions. In this case, we just wrap it in Ok.
The remaining three arguments inform Jzon what type to assign to the three (remember, object3) fields. In this case, they are all strings.
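To see how the two conversion functions pair up, here is a small round-trip sketch that encodes a record to json and decodes it again. It repeats the type and codec from above so that it stands alone, it assumes rescript-jzon is installed, and the bread value is made up purely for illustration.

```rescript
// Assumes rescript-jzon is installed and listed in bs-dependencies.
type addRecipeInput = {
  title: string,
  ingredients: string,
  instructions: string,
}

let addRecipeInputCodec = Jzon.object3(
  ({title, ingredients, instructions}) => (title, ingredients, instructions),
  ((title, ingredients, instructions)) =>
    {title: title, ingredients: ingredients, instructions: instructions}->Ok,
  Jzon.field("title", Jzon.string),
  Jzon.field("ingredients", Jzon.string),
  Jzon.field("instructions", Jzon.string),
)

// Hypothetical sample value, used only for illustration.
let bread = {title: "Bread", ingredients: "Flour and Water", instructions: "Mix and Bake"}

// Record -> Js.Json.t: goes through the first function and the field list.
let json = addRecipeInputCodec->Jzon.encode(bread)

// Js.Json.t -> result: goes through the field list and the second function.
let roundTrip = addRecipeInputCodec->Jzon.decode(json)

switch roundTrip {
| Ok(recipe) => Js.log(recipe.title)
| Error(error) => Js.log(error->Jzon.DecodingError.toString)
}
```

Encoding cannot fail, so it returns the json directly, while decoding always hands back a result that has to be matched on.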
Converting options to results
One problem I had is that bs-express is returning an Option for invalid json, whereas jzon much more sensibly works with Results. Because we are dealing with several such options, I created the following function to convert from an Option to a Result with an informative error message:
let jsonResult = o => o->Belt.Option.mapWithDefault(Error(#SyntaxError("Invalid JSON")), s => Ok(s))
The #SyntaxError polymorphic variant is meant to cooperate with the same polymorphic variant that is returned by Jzon functions. This makes it easy to flatMap with any Results from Jzon.
This probably belongs in a utility library, but I just put it at the top of Controller.res for now.
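As a quick illustration of both outcomes, the sketch below runs jsonResult on a None and on a Some. The describe helper is hypothetical and exists only for this example; it uses plain strings in place of real request bodies.

```rescript
let jsonResult = o => o->Belt.Option.mapWithDefault(Error(#SyntaxError("Invalid JSON")), s => Ok(s))

// Hypothetical helper, only here to show both outcomes of jsonResult.
let describe = bodyOption =>
  switch jsonResult(bodyOption) {
  | Ok(body) => "got " ++ body
  | Error(#SyntaxError(message)) => message
  }

let missing = describe(None) // "Invalid JSON"
let present = describe(Some("a body")) // "got a body"
```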
Using the decoder
addRecipe accepts arbitrary Json and needs to create an addRecipeInput from it. This is easily done with Jzon. Replace the entire contents of addRecipe with the following:
let addRecipe = bodyOption => {
  let jsonBodyOption =
    bodyOption->jsonResult->Belt.Result.flatMap(j => addRecipeInputCodec->Jzon.decode(j))
  let jsonResponse = Js.Dict.empty()
  switch jsonBodyOption {
  | Ok({title, instructions, ingredients}) => {
      let id = Store.uuid()
      Store.Reducer.dispatch(
        AddRecipe({id: id, title: title, ingredients: ingredients, instructions: instructions}),
      )
      jsonResponse->Js.Dict.set("id", id->Js.Json.string)
    }
  | _ => jsonResponse->Js.Dict.set("error", "missing attribute"->Js.Json.string)
  }
  jsonResponse->Js.Json.object_
}
The bulk of the change is the replacement of a massive pipeline at the beginning with a fairly short pipeline that fits all on one line. The incoming bodyOption may contain None or Some(json). We use jsonResult to turn it into a Result, then flatMap to convert the json (if it is Ok) into an addRecipeInput by passing it through Jzon.decode for the specified codec.
The rest of the change is to switch up the switch statement to work with a Result<addRecipeInput> instead of a Some(crazy tuple of Some)s.
Writing a couple encoders
The function looks a little better already, but it’s still doing a lot of Js.Json.* manipulation for the return value it creates. Let’s create some jzon encoders for these.
This function can return two possible things, either a {"id": string} or a {"error": string}. The latter is actually used in several other functions as well, so let’s start with that:
type errorResult = {error: string}
let errorResultCodec = Jzon.object1(
  ({error}) => error,
  error => {error: error}->Ok,
  Jzon.field("error", Jzon.string),
)
Once again, we create a simple type for the record. The Codec accepts one function that converts an instance of that type to the intermediate representation and a second function that does the opposite. In this case, the intermediate representation is not a tuple (since you can’t have single-element tuples in Rescript), but just the error value itself. The last argument tells jzon exactly what field this single-field object has: error with type string.
The type and codec for the successful case is virtually identical:
type addRecipeSuccess = {id: string}
let addRecipeSuccessCodec = Jzon.object1(
  ({id}) => id,
  id => {id: id}->Ok,
  Jzon.field("id", Jzon.string),
)
Now we can simplify the addRecipe function further. It should never have to construct a jsonResponse dict. The switch statement that matches on Ok or Error can just pass the value through Jzon.encode and let it return directly:
let addRecipe = bodyOption => {
  let jsonBodyOption =
    bodyOption->jsonResult->Belt.Result.flatMap(j => addRecipeInputCodec->Jzon.decode(j))
  switch jsonBodyOption {
  | Ok({title, instructions, ingredients}) => {
      let id = Store.uuid()
      Store.Reducer.dispatch(
        AddRecipe({id: id, title: title, ingredients: ingredients, instructions: instructions}),
      )
      addRecipeSuccessCodec->Jzon.encode({id: id})
    }
  | _ => errorResultCodec->Jzon.encode({error: "missing attribute"})
  }
}
There is one more thing jzon can do for us. In the original code, we always returned "missing attribute" as the error message because we didn’t have any additional information. However, the Error that jzon gives us is more informative for the user.
So, replace the second (_) arm of the switch statement with one that extracts the error as follows:
| Error(error) => errorResultCodec->Jzon.encode({error: error->Jzon.DecodingError.toString})
After making this change, I knew that it worked because my tests failed. Instead of “missing attribute”, it now tells the user that the field “title” is missing. That’s much more useful! I updated the test addRecipe missing attribute function (in TestController.test.res) so the expected value matches what jzon is providing:
let expected = `{"error":"Missing field \\\"title\\\" at ."}`
That’s basically all there is to jzon. You define the codec with the schema you require and provide functions to convert to and from that schema. Things can get messier when your schema contains nested objects and the like, but you can pass the encoding down to other encoders as needed. It’s definitely much cleaner than trying to do raw Js.Json conversions, and much safer than arbitrarily binding types to JSON.parse.
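As a rough sketch of what passing the encoding down looks like, the example below nests one codec inside another by handing it to Jzon.field. The author and post types are hypothetical and not part of the recipes app, and the sketch assumes rescript-jzon is installed.

```rescript
// Hypothetical types, not part of the recipes app.
type author = {name: string}
type post = {title: string, author: author}

let authorCodec = Jzon.object1(
  ({name}) => name,
  name => {name: name}->Ok,
  Jzon.field("name", Jzon.string),
)

let postCodec = Jzon.object2(
  ({title, author}) => (title, author),
  ((title, author)) => {title: title, author: author}->Ok,
  Jzon.field("title", Jzon.string),
  // The nested object is described by handing its own codec to the field.
  Jzon.field("author", authorCodec),
)

// Hypothetical sample value for a round trip through json.
let samplePost = {title: "Hello", author: {name: "Ada"}}
let roundTrip = postCodec->Jzon.decode(postCodec->Jzon.encode(samplePost))
```

Each codec only knows about its own level, so the nesting in the json mirrors the nesting of the records.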
Another example (addTagToRecipe)
The addTagToRecipe function requires a new type and codec for its input:
type addTagToRecipe = {
  recipeId: string,
  tag: string,
}
let addTagToRecipeInputCodec = Jzon.object2(
  ({recipeId, tag}) => (recipeId, tag),
  ((recipeId, tag)) =>
    {
      recipeId: recipeId,
      tag: tag,
    }->Ok,
  Jzon.field("recipeId", Jzon.string),
  Jzon.field("tag", Jzon.string),
)
It also needs a new type for the success message. The success message is an object with one field (like the addRecipe success message), but that one field is a boolean. This is a type that might be used in multiple endpoints, so I’m giving it a generic name:
type genericSuccess = {success: bool}
let genericSuccessCodec = Jzon.object1(
  ({success}) => success,
  success => {success: success}->Ok,
  Jzon.field("success", Jzon.bool),
)
Now, this is kind of a redundant message, and some might argue the endpoint should just return 200 OK with no body. I’m going to leave it as is, though, because I don’t want the controller layer to have to know anything about status codes.
We don’t need to define an error type for this function because it’s the same type we already defined for addRecipe. Here’s the new addTagToRecipe function:
let addTagToRecipe = bodyOption => {
  let jsonBodyOption =
    bodyOption->jsonResult->Belt.Result.flatMap(j => addTagToRecipeInputCodec->Jzon.decode(j))
  switch jsonBodyOption {
  | Ok({recipeId, tag}) =>
    switch Store.Reducer.getState().recipes->Belt.Map.String.get(recipeId) {
    | Some(recipe) => {
        Store.Reducer.dispatch(AddTag({recipeId: recipe.id, tag: tag}))
        genericSuccessCodec->Jzon.encode({success: true})
      }
    | None => errorResultCodec->Jzon.encode({error: "recipe does not exist"})
    }
  | Error(error) => errorResultCodec->Jzon.encode({error: error->Jzon.DecodingError.toString})
  }
}
The function has a new nested switch because we were previously checking for the existence of the recipe in the code that was extracting the json from the body. This might seem like a readability regression, but it comes with the advantage of being able to explicitly inform the reader that the recipe does not exist, rather than just saying “invalid request”, which is not at all helpful (or even true!).
A slightly more complicated example (getRecipe)
The getRecipe params arrive as a json dict containing a single field, “id”. But we’ve already defined a codec for this! We foolishly named it addRecipeSuccess, so let’s first change that to genericId:
type genericId = {id: string}
let genericIdCodec = Jzon.object1(({id}) => id, id => {id: id}->Ok, Jzon.field("id", Jzon.string))
This allows us to simplify the param retrieval in getRecipe from:
let recipeOption =
  params
  ->Js.Dict.get("id")
  ->Belt.Option.flatMap(Js.Json.decodeString)
  ->Belt.Option.flatMap(id => state.recipes->Belt.Map.String.get(id))
into the (longish) one-liner:
let recipeResult = genericIdCodec->Jzon.decode(params->Js.Json.object_)
There’s a funny jump in there where we pass params into Js.Json.object_ so it is a type that Jzon.decode understands, but otherwise it’s short and sweet.
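Here is a minimal sketch of that jump in isolation. The params dict below is a hand-built stand-in for what getRecipe receives, the "abc123" id is made up, and the type and codec are repeated so the example stands alone (assuming rescript-jzon is installed).

```rescript
// Assumes rescript-jzon is installed and listed in bs-dependencies.
type genericId = {id: string}
let genericIdCodec = Jzon.object1(({id}) => id, id => {id: id}->Ok, Jzon.field("id", Jzon.string))

// Hypothetical stand-in for the params dict handed to getRecipe.
let params = Js.Dict.empty()
params->Js.Dict.set("id", Js.Json.string("abc123"))

// Js.Json.object_ wraps a Js.Dict.t<Js.Json.t> as a Js.Json.t,
// which is the input type Jzon.decode works with.
let recipeResult = genericIdCodec->Jzon.decode(params->Js.Json.object_)
```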
Now let’s turn our attention to the output. The interesting thing about this endpoint is that the outputs are json representations of types that already exist: the recipe type in the store. Technically that type has two fields we are not currently returning from the endpoint (updatedAt and deleted), but we probably should be returning the whole data, so let’s encode it as such.
This means we don’t have to define a new type for the output, but we do need to create a codec for the existing type:
let recipeCodec = Jzon.object7(
  ({id, title, ingredients, instructions, tags, updatedAt, deleted}: Store.recipe) => (
    id,
    title,
    ingredients,
    instructions,
    tags,
    updatedAt,
    deleted,
  ),
  ((id, title, ingredients, instructions, tags, updatedAt, deleted)) =>
    {
      Store.id: id,
      title: title,
      ingredients: ingredients,
      instructions: instructions,
      tags: tags,
      updatedAt: updatedAt,
      deleted: deleted,
    }->Ok,
  Jzon.field("id", Jzon.string),
  Jzon.field("title", Jzon.string),
  Jzon.field("ingredients", Jzon.string),
  Jzon.field("instructions", Jzon.string),
  Jzon.field("tags", Jzon.array(Jzon.string)),
  Jzon.field("updatedAt", Jzon.float),
  Jzon.field("deleted", Jzon.bool),
)
This is kinda long because of the seven fields, but we can now use this codec to convert any Store.recipe to a json object and back anywhere in our code. One thing to notice is the way the array of tags is treated. There is a Jzon.array function that accepts the codec for the things in that array. That can even be some other Codec you have defined yourself. Also notice how I typed the input as Store.recipe in the function in the first argument, and qualified the first field as Store.id in the record built by the second argument, so that Rescript knows what type is supposed to be output.
Now we can update getRecipe as follows:
let getRecipe = params => {
  let state = Store.Reducer.getState()
  let recipeResult = genericIdCodec->Jzon.decode(params->Js.Json.object_)
  switch recipeResult {
  | Ok({id}) =>
    switch state.recipes->Belt.Map.String.get(id) {
    | Some(recipe) => recipeCodec->Jzon.encode(recipe)
    | None => errorResultCodec->Jzon.encode({error: "unable to find that recipe"})
    }
  | Error(error) => errorResultCodec->Jzon.encode({error: error->Jzon.DecodingError.toString})
  }
}
This is a bit shorter than the original because we don’t have to independently encode each of the fields in the function. It also allows us to send error messages that distinguish between invalid json and a missing recipe, as we discussed with the addTagToRecipe function.
Now, this change breaks the controller unit tests. Those tests were comparing the output to a json string that is now missing the deleted and updatedAt fields we just added. The obvious fix of just updating the json encoded string won’t work because updatedAt changes on every invocation. I could write a binding to something like sinon.useFakeTimers or manually mock the Date constructor, but I find such mocks generally cause more harm than good.
Instead, I’m going to load the result into a recipe using our fancy new codec (I knew it would come in handy!) and compare the values directly. This is how I would have liked to code it in the first place, but I didn’t want to copy all the decoding logic into my unit test. Now I don’t have to worry about that because Jzon and my codec are taking care of that part.
This requires more lines of code, but it actually makes the tests more readable, which is of paramount importance. Consider the original test code that is now failing:
let json = result->Js.Json.stringifyAny->Belt.Option.getUnsafe
let expected = `{"id":"${id}","title":"Bread","ingredients":"Flour and Water","instructions":"Mix and Bake","tags":[]}`
t->equal(json, expected, "get recipe should match input")
And compare it to the new version:
let actual = Controller.recipeCodec->Jzon.decode(result)->Belt.Result.getExn
t->equal(actual.id, id, "should have same ids")
t->equal(actual.title, "Bread", "should have same title")
t->equal(actual.ingredients, "Flour and Water", "have same ingredients")
t->equal(actual.instructions, "Mix and Bake", "have same instructions")
t->equal(actual.deleted, false, "should not be deleted")
t->equal(actual.tags->Belt.Array.length, 0, "Should not have any tags")
Which of those would you rather debug if you got a test failure?
The unit test calls getRecipe again after adding the tag, so you’ll need to change a second location to look like this:
let actual = Controller.recipeCodec->Jzon.decode(result)->Belt.Result.getExn
t->equal(actual.id, id, "should have same ids")
t->equal(actual.title, "Bread", "should have same title")
t->equal(actual.ingredients, "Flour and Water", "have same ingredients")
t->equal(actual.instructions, "Mix and Bake", "have same instructions")
t->equal(actual.deleted, false, "should not be deleted")
t->equal(actual.tags->Belt.Array.length, 1, "Should have one tag")
t->equal(actual.tags->Belt.Array.getUnsafe(0), "Carbs", "First tag should be carbs")
Conclusion
Thus ends my introduction to jzon. I don’t think I’ve reduced the total amount of code in my project at all because I’m only using each type of json in a couple places at most. But it has simplified the individual functions, and on a larger and more realistic project, there would be a lot more reuse of the various encodings.
Overall, I think that jzon is a neat little library with a pleasant API. I didn’t dive into the problems of deeply nested objects here, and there may be a point where it is better to bind to a Javascript json-schema library. However, most json use cases are fairly flat (for the developer’s sanity, if nothing else!), and jzon is highly suitable for that.
I think I’m done with this recipes app for now. It’s taught us a lot about Rescript and about several interesting technologies. But I want to move on to exploring some more advanced Rescript concepts.