Parsing JSON in Rescript

Introduction

This is part of an ongoing series about the Rescript programming language. I’ve been building a toy “recipe book” progressive web app. The app is largely functional now, and I’m using explorations of some third-party libraries to clean it up a bit. In this edition, we discuss parsing JSON.

Patreon

This series takes a lot of time and dedication to write and maintain. The main thing that has kept me invested in writing this series is support from my Patrons.

Other ways to show support include sharing the articles on social media, commenting on them here, or a quick thank you on the Rescript forum.

Getting our bearings

This article will improve the existing JSON parsing in the express portion of the rescript-express-recipes project. You can git checkout the zora-express branch if you want to start from the same place I am in this article.

If you want to follow along with the changes in this article, I try to make a separate commit for each section. You can find them in the jzon branch.

Before we go there, though, I want to make an update to make node 16 a little happier. A few articles back, I changed the package.json to use "type": "module", and made the bsconfig build modules in "module": "es6" format.

Node 16 can’t figure out whether an import should happen in commonjs or es6 format without a little help (it used to just attempt it blindly). The easiest way to supply that help is to make all the files have .mjs extensions instead of .bs.js.

So go ahead and change the suffix in bsconfig.json to .mjs.
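As a rough sketch, the relevant part of bsconfig.json might end up looking like this. Only the two keys discussed here are shown; everything else in your file stays as it is, and the exact layout of package-specs in your project may differ:

```json
{
  "package-specs": {
    "module": "es6"
  },
  "suffix": ".mjs"
}
```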

Rescript might need a little massaging to pick up the change in the dependencies, though. Try running npx rescript build -with-deps and if that doesn’t work, do the ever-so-popular dance of blowing away your node modules, running npm install and then npx rescript build -with-deps again.

Why JSON is hard

JSON is a beautiful and simple data exchange format, but it has a serious flaw: it’s schemaless by default. When you receive a JSON object from an external source, you have absolutely no guarantees about the content or format of that object. You aren’t even guaranteed that it is valid JSON.

While tools like json-schema can help make communication about the json easier, they don’t solve the fundamental question of “is this valid data”.

When working in an untyped language, you can get away (sort of) with not rigorously checking every field on a json object. You’ll end up with some nasty runtime errors if you don’t do it right, but you can get away with it. You know, in development when you are just prototyping everything and of course you would never push that code to prod, but there is a deadline and really, how bad can it be…

In a strongly typed language, you don’t get that shortcut: you need to take that “unknown” object and convert it into an object with known types, basically one field at a time. If any of the types don’t line up, you need to handle it (often returning 400 bad request, but possibly setting a default or otherwise working around it).

So far in this project, I’ve used the Js.Json API in the rescript standard library to do all my JSON parsing. This API is pretty low-level and requires a lot of boilerplate to do anything useful. You basically have to manually look up every field on the object and check first if it exists and then if it contains valid data in the expected format. This guarantees that when you are done you have the object you expect, but it is hard to write and harder to read.

I won’t say it’s error-prone, however, because the compiler tells me about all the errors in advance!
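To give a feel for that boilerplate, here is a minimal sketch of pulling a single string field out of arbitrary JSON with Js.Json alone. The decodeTitle helper is hypothetical and not taken from the project; a real payload needs a chain like this for every field:

```rescript
// Hypothetical helper: extract a required "title" string by hand.
// Each step can fail, so each step yields an option:
//   1. decodeObject returns None unless the value is a JSON object
//   2. Js.Dict.get returns None if the key is absent
//   3. decodeString returns None if the value is not a string
let decodeTitle = (json: Js.Json.t): option<string> =>
  json
  ->Js.Json.decodeObject
  ->Belt.Option.flatMap(dict => dict->Js.Dict.get("title"))
  ->Belt.Option.flatMap(Js.Json.decodeString)

// Some("Bread")
let title = Js.Json.parseExn(`{"title": "Bread"}`)->decodeTitle
```

Note that a None here does not say which of the three steps failed, which is part of why error messages built on this approach tend to be vague.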

A better parser?

There are quite a few JSON parsers in the older bucklescript and Reason ecosystems. Most don’t appear to be maintained anymore, and they tend to use outdated apis that don’t really fit with modern rescript development.

One newer project is jzon. It specifically targets Rescript syntax and has ultra-low boilerplate. My only concern with the library is that it is quite new and there’s no indication as to whether it will be supported in the long run (it’s already been six months since its last release).

Then again, nothing else appears to be either!

Installing the dependency

As with all rescript dependencies, installing it is a two step process:

npm install --save rescript-jzon

and then update the bs-dependencies in your bsconfig.json to include "rescript-jzon".
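A minimal sketch of that bsconfig.json change, showing only the one key. Your project will already list other dependencies in this array; keep them and add "rescript-jzon" alongside:

```json
{
  "bs-dependencies": ["rescript-jzon"]
}
```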

Our first codec

At first, I was hoping I could just map the json schema we are already using to the types in Store.res. Unfortunately, most of our payloads are actually not exactly those types. So instead, I’ll define payloads for the individual functions in Controller.res. (Don’t forget to run npm test so that zora can let you know when it’s working again!)

Let’s start with the request payload type for the addRecipe body. It is supposed to be a json dict with three string fields: title, ingredients, and instructions:

type addRecipeInput = {
  title: string,
  ingredients: string,
  instructions: string,
}

(I put this in Controller.res just before the addRecipe endpoint.)

This is just a standard Rescript record; nothing too exciting. The next step is to configure Jzon to work with this type. We basically need to provide Jzon with a couple of functions to convert objects to an intermediate format and back again, and a schema for the three fields:

let addRecipeInputCodec = Jzon.object3(
  ({title, ingredients, instructions}) => (title, ingredients, instructions),
  ((title, ingredients, instructions)) =>
    {
      title: title,
      ingredients: ingredients,
      instructions: instructions,
    }->Ok,

  Jzon.field("title", Jzon.string),
  Jzon.field("ingredients", Jzon.string),
  Jzon.field("instructions", Jzon.string),
)

Let’s break that down a bit. First, the variable name addRecipeInputCodec is telling us this is the way to encode or decode an addRecipeInput type to json.

The Jzon.object3 function call says “we want to create a json object with three fields”. Rescript doesn’t have language support for variadic arguments (other than binding to Javascript functions that use them), so you occasionally see numbered function names like this. They are fairly common in Rescript-React for things like the dependency array in a useEffect hook (useEffect1, useEffect2, and so on), for example.

The first argument is a simple function to convert an object to an intermediary tuple format that jzon knows how to work with.

The second argument does the opposite; it’s a function to convert the intermediary format to an object. This one returns a result, since the conversion may not go well for more complicated functions. In this case, we just wrap it in Ok.

The remaining three arguments inform Jzon what type to assign to the three (remember, object3) fields. In this case, they are all strings.
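To see the codec in action before wiring it into the controller, here is a small sketch. It assumes the addRecipeInput type and addRecipeInputCodec defined above are in scope; the payload is made up and deliberately leaves out the title field:

```rescript
// Assumes addRecipeInputCodec from above is in scope.
// A made-up payload: valid JSON, but the "title" field is missing.
let incomplete = Js.Json.parseExn(`{"ingredients": "Flour", "instructions": "Bake"}`)

switch addRecipeInputCodec->Jzon.decode(incomplete) {
| Ok({title}) => Js.log2("decoded recipe:", title)
// Jzon.DecodingError.toString turns the error into a readable message
| Error(error) => Js.log(error->Jzon.DecodingError.toString)
}
```

Decoding should land in the Error branch here, since one of the three declared fields is absent.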

Converting options to results

One problem I had is that bs-express is returning an Option for invalid json, whereas jzon much more sensibly works with Results. Because we are dealing with several such options, I created the following function to convert from an Option to a Result with an informative error message:

let jsonResult = o => o->Belt.Option.mapWithDefault(Error(#SyntaxError("Invalid JSON")), s => Ok(s))

The #SyntaxError polymorphic variant is meant to cooperate with the same polymorphic variant that is returned by Jzon methods. This makes it easy to flatMap with any Results from Jzon.

This probably belongs in a utility library, but I just put it at the top of Controller.res for now.
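If the mapWithDefault call reads as a bit dense, the same conversion can be spelled out as an explicit switch. This is only an equivalent sketch; the name jsonResultExplicit is hypothetical and the rest of the article keeps using jsonResult:

```rescript
// Hypothetical alternative spelling of jsonResult; it behaves the same way.
let jsonResultExplicit = bodyOption =>
  switch bodyOption {
  // A parsed body passes through untouched
  | Some(json) => Ok(json)
  // A missing body becomes the same kind of error Jzon reports for bad JSON
  | None => Error(#SyntaxError("Invalid JSON"))
  }
```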

Using the decoder

addRecipe accepts arbitrary Json and needs to create an addRecipeInput from it. This is easily done with Jzon. Replace the entire contents of addRecipe with the following:

let addRecipe = bodyOption => {
  let jsonBodyOption =
    bodyOption->jsonResult->Belt.Result.flatMap(j => addRecipeInputCodec->Jzon.decode(j))

  let jsonResponse = Js.Dict.empty()

  switch jsonBodyOption {
  | Ok({title, instructions, ingredients}) => {
      let id = Store.uuid()
      Store.Reducer.dispatch(
        AddRecipe({id: id, title: title, ingredients: ingredients, instructions: instructions}),
      )
      jsonResponse->Js.Dict.set("id", id->Js.Json.string)
    }
  | _ => jsonResponse->Js.Dict.set("error", "missing attribute"->Js.Json.string)
  }

  jsonResponse->Js.Json.object_
}

The bulk of the change is the replacement of a massive pipeline at the beginning with a fairly short pipeline that fits all on one line. The incoming bodyOption may contain None or Some(json), so jsonResult first turns it into an Error or Ok(json). We then use flatMap to convert the json (if it is Ok) into an addRecipeInput by passing it through Jzon.decode for the specified codec.

The rest of the change is to switch up the switch statement to work with a Result<addRecipeInput> instead of a Some(crazy tuple of Some)s.

Writing a couple encoders

The function looks a little better already, but it’s still doing a lot of Js.Json.* manipulation for the return value it creates. Let’s create some jzon encoders for these.

This method can return two possible things, either a {"id": string} or a {"error": string}. The latter is actually used in several other functions as well so let’s start with that:

type errorResult = {error: string}

let errorResultCodec = Jzon.object1(
  ({error}) => error,
  error => {error: error}->Ok,
  Jzon.field("error", Jzon.string),
)

Once again, we create a simple type for the record. The Codec accepts one function that converts an instance of that type to the intermediate representation and a second function that does the opposite. In this case, the intermediate representation is not a tuple (since you can’t have single-element tuples in Rescript), but just the error value itself. The last argument tells jzon exactly what field this single-field object has: error with type string.

The type and codec for the successful case is virtually identical:

type addRecipeSuccess = {id: string}

let addRecipeSuccessCodec = Jzon.object1(
  ({id}) => id,
  id => {id: id}->Ok,
  Jzon.field("id", Jzon.string),
)

Now we can simplify the addRecipe function further. It should never have to construct a jsonResponse dict. The switch statement that matches on Ok or Error can just pass the value through Jzon.encode and let it return directly:

let addRecipe = bodyOption => {
  let jsonBodyOption =
    bodyOption->jsonResult->Belt.Result.flatMap(j => addRecipeInputCodec->Jzon.decode(j))

  switch jsonBodyOption {
  | Ok({title, instructions, ingredients}) => {
      let id = Store.uuid()
      Store.Reducer.dispatch(
        AddRecipe({id: id, title: title, ingredients: ingredients, instructions: instructions}),
      )
      addRecipeSuccessCodec->Jzon.encode({id: id})
    }
  | _ => errorResultCodec->Jzon.encode({error: "missing attribute"})
  }
}

There is one more thing jzon can do for us. In the original code, we always returned "missing attribute" as the error message because we didn’t have any additional information. However, the Error that jzon gives us is more informative for the user.

So, replace the second arm of the switch statement (the _ one) with one that extracts the error as follows:

  | Error(error) => errorResultCodec->Jzon.encode({error: error->Jzon.DecodingError.toString})

After making this change, I knew that it worked because my tests failed. Instead of “missing attribute”, it is telling the user “Missing field "title" at .”. That’s much more useful! I updated the test addRecipe missing attribute function (in TestController.test.res) so the expected value matches what jzon is providing:

    let expected = `{"error":"Missing field \\\"title\\\" at ."}`

That’s basically all there is to jzon. You define the codec with the schema you require and provide functions to convert to and from that schema. Things can get messier when your schema contains nested objects and the like, but you can pass the encoding down to other encoders as needed. It’s definitely much cleaner than trying to do raw Js.Json conversions, and much safer than arbitrarily binding types to JSON.parse.
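As a rough sketch of what passing the encoding down looks like, a codec can itself be used as the type of a field. The author and post types below are hypothetical and not part of the recipes app:

```rescript
// Hypothetical inner record and its codec
type author = {name: string}

let authorCodec = Jzon.object1(
  ({name}) => name,
  name => {name: name}->Ok,
  Jzon.field("name", Jzon.string),
)

// Hypothetical outer record that contains the inner one
type post = {title: string, author: author}

let postCodec = Jzon.object2(
  ({title, author}) => (title, author),
  ((title, author)) => {title: title, author: author}->Ok,
  Jzon.field("title", Jzon.string),
  // The nested object is described by handing its codec to Jzon.field
  Jzon.field("author", authorCodec),
)
```

The outer codec never needs to know how the inner record is laid out; it only needs the inner codec.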

Another example (addTagToRecipe)

The addTagToRecipe function requires a new type and codec for its input:

type addTagToRecipe = {
  recipeId: string,
  tag: string,
}

let addTagToRecipeInputCodec = Jzon.object2(
  ({recipeId, tag}) => (recipeId, tag),
  ((recipeId, tag)) =>
    {
      recipeId: recipeId,
      tag: tag,
    }->Ok,
  Jzon.field("recipeId", Jzon.string),
  Jzon.field("tag", Jzon.string),
)

It also needs a new type for the success message. The success message is an object with one field (like the addRecipe success message), but that one field is a boolean. This is a type that might be used in multiple endpoints, so I’m titling it generic:

type genericSuccess = {success: bool}

let genericSuccessCodec = Jzon.object1(
  ({success}) => success,
  success => {success: success}->Ok,
  Jzon.field("success", Jzon.bool),
)

Now, this is kind of a redundant message, and some might argue the endpoint should just return 200 Ok with no body. I’m going to leave it as is, though, because I don’t want the controller layer to have to know anything about status codes.

We don’t need to define an error type for this function because it’s the same type we already defined for addRecipe. Here’s the new addTagToRecipe function:

let addTagToRecipe = bodyOption => {
  let jsonBodyOption =
    bodyOption->jsonResult->Belt.Result.flatMap(j => addTagToRecipeInputCodec->Jzon.decode(j))

  switch jsonBodyOption {
  | Ok({recipeId, tag}) =>
    switch Store.Reducer.getState().recipes->Belt.Map.String.get(recipeId) {
    | Some(recipe) => {
        Store.Reducer.dispatch(AddTag({recipeId: recipe.id, tag: tag}))
        genericSuccessCodec->Jzon.encode({success: true})
      }
    | None => errorResultCodec->Jzon.encode({error: "recipe does not exist"})
    }
  | Error(error) => errorResultCodec->Jzon.encode({error: error->Jzon.DecodingError.toString})
  }
}

The function has a new nested switch because we were previously checking for the existence of the recipe in the code that was extracting the json from the body. This might seem like a readability regression, but it comes with the advantage of being able to explicitly inform the user that the recipe does not exist, rather than just saying “invalid request”, which is not at all helpful (or even true!).

A slightly more complicated example (getRecipe)

The getRecipe params arrive as a json dict containing a single field, “id”. But we’ve already defined a codec for this! We foolishly named it addRecipeSuccess, so let’s first change that to genericId:

type genericId = {id: string}

let genericIdCodec = Jzon.object1(({id}) => id, id => {id: id}->Ok, Jzon.field("id", Jzon.string))

This allows us to simplify the param retrieval in getRecipe from:

  let recipeOption =
    params
    ->Js.Dict.get("id")
    ->Belt.Option.flatMap(Js.Json.decodeString)
    ->Belt.Option.flatMap(id => state.recipes->Belt.Map.String.get(id))

into the (longish) one-liner:

  let recipeResult = genericIdCodec->Jzon.decode(params->Js.Json.object_)

There’s a funny jump in there where we pass params into Js.Json.object_ so it is a type that Jzon.decode understands, but otherwise it’s short and sweet.
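That jump is just a type-level wrapper, which a tiny sketch makes visible. The params value here is a hypothetical stand-in for what the express layer hands to getRecipe:

```rescript
// Hypothetical stand-in for the params dict passed to getRecipe
let params: Js.Dict.t<Js.Json.t> = Js.Dict.fromArray([("id", Js.Json.string("abc123"))])

// Js.Json.object_ presents the dict as a Js.Json.t, which is what Jzon.decode expects
let paramsJson: Js.Json.t = params->Js.Json.object_
```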

Now let’s turn our attention to the output. The interesting thing about this endpoint is that the outputs are json representations of types that already exist: the recipe type in the store. Technically that type has two fields we are not currently returning from the endpoint (updatedAt and deleted), but we probably should be returning the whole record, so let’s encode it as such.

This means we don’t have to define a new type for the output, but we do need to create a codec for the existing type:

let recipeCodec = Jzon.object7(
  ({id, title, ingredients, instructions, tags, updatedAt, deleted}: Store.recipe) => (
    id,
    title,
    ingredients,
    instructions,
    tags,
    updatedAt,
    deleted,
  ),
  ((id, title, ingredients, instructions, tags, updatedAt, deleted)) =>
    {
      Store.id: id,
      title: title,
      ingredients: ingredients,
      instructions: instructions,
      tags: tags,
      updatedAt: updatedAt,
      deleted: deleted,
    }->Ok,
  Jzon.field("id", Jzon.string),
  Jzon.field("title", Jzon.string),
  Jzon.field("ingredients", Jzon.string),
  Jzon.field("instructions", Jzon.string),
  Jzon.field("tags", Jzon.array(Jzon.string)),
  Jzon.field("updatedAt", Jzon.float),
  Jzon.field("deleted", Jzon.bool),
)

This is kinda long because of the seven fields, but we can now use this codec to convert any Store.recipe to a json object and back anywhere in our code. One thing to notice is the way the array of tags is treated. There is a Jzon.array function that accepts the type of the things in that array. That type can even be some other Codec you have defined yourself. Also notice how I typed the input as Store.recipe in the function in the first argument, and qualified the first field as Store.id in the record built by the second argument, so that Rescript knows which record type is supposed to be output.
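To illustrate Jzon.array with a codec of your own rather than a built-in one, here is a minimal sketch. The ingredient and shoppingList types are made up for illustration and do not exist in the recipes app:

```rescript
// Hypothetical element type and its codec
type ingredient = {name: string, grams: float}

let ingredientCodec = Jzon.object2(
  ({name, grams}) => (name, grams),
  ((name, grams)) => {name: name, grams: grams}->Ok,
  Jzon.field("name", Jzon.string),
  Jzon.field("grams", Jzon.float),
)

// Hypothetical container holding an array of those elements
type shoppingList = {items: array<ingredient>}

let shoppingListCodec = Jzon.object1(
  ({items}) => items,
  items => {items: items}->Ok,
  // Jzon.array takes the codec for a single element
  Jzon.field("items", Jzon.array(ingredientCodec)),
)
```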

Now we can update getRecipe as follows:

let getRecipe = params => {
  let state = Store.Reducer.getState()
  let recipeResult = genericIdCodec->Jzon.decode(params->Js.Json.object_)
  switch recipeResult {
  | Ok({id}) =>
    switch state.recipes->Belt.Map.String.get(id) {
    | Some(recipe) => recipeCodec->Jzon.encode(recipe)
    | None => errorResultCodec->Jzon.encode({error: "unable to find that recipe"})
    }
  | Error(error) => errorResultCodec->Jzon.encode({error: error->Jzon.DecodingError.toString})
  }
}

This is a bit shorter than the original because we don’t have to independently encode each of the fields in the function. It also allows us to send error messages that distinguish between invalid json and a missing recipe, as we discussed with the addTagToRecipe function.

Now, this change breaks the controller unit tests. Those tests were comparing the output to a json string that is now missing the deleted and updatedAt fields we just added. The obvious fix of just updating the json encoded string won’t work because updatedAt changes on every invocation. I could write a binding to something like sinon.useFakeTimers or manually mock the Date constructor, but I find such mocks generally cause more harm than good.

Instead, I’m going to load the result into a recipe using our fancy new codec (I knew it would come in handy!) and compare the values directly. This is how I would have liked to code it in the first place, but I didn’t want to copy all the decoding logic into my unit test. Now I don’t have to worry about that because Jzon and my codec are taking care of that part.

This requires more lines of code, but it actually makes the tests more readable, which is of paramount importance. Consider the original test code that is now failing:

    let json = result->Js.Json.stringifyAny->Belt.Option.getUnsafe
    let expected = `{"id":"${id}","title":"Bread","ingredients":"Flour and Water","instructions":"Mix and Bake","tags":[]}`
    t->equal(json, expected, "get recipe should match input")

And compare it to the new version:

    let actual = Controller.recipeCodec->Jzon.decode(result)->Belt.Result.getExn
    t->equal(actual.id, id, "should have same ids")
    t->equal(actual.title, "Bread", "should same title")
    t->equal(actual.ingredients, "Flour and Water", "have same ingredients")
    t->equal(actual.instructions, "Mix and Bake", "have same instructions")
    t->equal(actual.deleted, false, "should not be deleted")
    t->equal(actual.tags->Belt.Array.length, 0, "Should not have any tags")

Which of those would you rather debug if you got a test failure?

The unit test calls getRecipe again after adding the tag, so you’ll need to change a second location to look like this:

    let actual = Controller.recipeCodec->Jzon.decode(result)->Belt.Result.getExn
    t->equal(actual.id, id, "should have same ids")
    t->equal(actual.title, "Bread", "should same title")
    t->equal(actual.ingredients, "Flour and Water", "have same ingredients")
    t->equal(actual.instructions, "Mix and Bake", "have same instructions")
    t->equal(actual.deleted, false, "should not be deleted")
    t->equal(actual.tags->Belt.Array.length, 1, "Should have one tag")
    t->equal(actual.tags->Belt.Array.getUnsafe(0), "Carbs", "First tag should be carbs")

Conclusion

Thus ends my introduction to jzon. I don’t think I’ve reduced the total amount of code in my project at all because I’m only using each type of json in a couple places at most. But it has simplified the individual functions, and on a larger and more realistic project, there would be a lot more reuse of the various encodings.

Overall, I think that jzon is a neat little library with a pleasant API. I didn’t dive into the problems of deeply nested objects here, and there may be a point where it is better to bind to a Javascript json-schema library. However, most json use cases are fairly flat (for the developer’s sanity, if nothing else!), and jzon is highly suitable for that.

I think I’m done with this recipes app for now. It’s taught us a lot about Rescript and about several interesting technologies. But I want to move on to exploring some more advanced Rescript concepts.