Type Safe Routing - Baby Steps

Type safe routing means different things to different people. So let’s start by clarifying the intended meaning in this post. There are 2 widely used definitions:

  • The first refers to the case where the type system is used to maintain hyper links in your web app so that there are no dangling links. This static checking also extends to the parameters accepted by the hyper links.

  • The second definition is processing URL’s in a way that assigns types to the values extracted.

The first definition is much stronger and is mostly useful for complete web apps. The second definition pertains to API’s exposed over HTTP and will be the subject of this blog post. I am not an expert on this subject and this post only documents my first steps as to how to solve this problem. The end solution that I propose is not quite satisfying but is elegant in its own right. Let’s begin.

The Problem

Let’s take a sample opium handler:

let h =
  put "/hello/:x/from/:y" begin fun req ->
    let (x,y) = (param req "x", param req "y") in
    let msg = sprintf "Hello %s! from %s." x y in
    `String msg |> respond |> return
  end

There is something very unsatisfying in the code above. We are attaching a function to a URL that contains 2 parameters x and y. We know in the body of the function that the parameters are always there (otherwise our handler would not be executed) and yet we don’t express this deduction statically.

Can we do better? Yes, and I’ll show you how. Let’s start by pretending that a solution exists:

let x =
  put (s "/hello" </> str </> s "from" </> str) begin fun x y ->
    let msg = sprintf "Hello %s! from %s." x y in
    `String msg |> respond |> return
  end

This is much better. The function that we are binding to the route is now typed. The combinators used to specify the routes should be familiar, but a brief explanation is in order. s x will consume a portion of the url defined by x. </> concatenates 2 routes sequentially. str specifies a string parameter that ends with a "/".

The advantages of this approach include:

  • We don’t have to extract the parameters manually out of the request or even name the parameters. Although, we can imagine adding support for that if it proves useful.

  • The compiler will warn us if we haven’t used an argument.

  • We can now specify the types of the parameters we accept. We can imagine using int to specify a numerical parameter. In fact, we can probably think of a host of combinators to add various functionality.

The advantages are obvious, but how can we implement it?

Aspiring OCaml hackers might want give this a problem a crack themselves. To specify the problem a little better, and perhaps give a hint to the solution this is interface which we will be implementing:

type ('a, 'b) t

val s : string -> ('a, 'a) t

val (</>) : ('b, 'c) t -> ('a, 'b) t -> ('a, 'c) t

val str : ('a, string -> 'a) t

val int : ('a, int -> 'a) t

val match_url : ('a, 'b) t -> string -> 'b -> 'a option

The Solution

If you have a hard problem related to static typing, a good rule of thumb is to check if Oleg has already solved it. Indeed, the problem above falls under that category but under a different context. Typed Formatting attempts to define an EDSL for specifying a type safe printf/scanf. Our problem of type safe routes is just a subset of that since we only need scanf. Oleg’s solution in Haskell involves GADT’s, which have almost have worn off the novelty for me in OCaml.

We will also need a generic parser type ('a Parser.t is a parser that yields a value of type 'a). A simple monadic parser will do. Since it’s not particularly interesting, I will link my own implementation in the end. Here’s the signature for the parser (Although I don’t think we will need all of these operations).

open Core.Std
(* A Substring.t is analgous to a string * int * int
   The parser return None on failure. Some with the result and the rest
   of the substring on success. *)
type 'a t = Substring.t -> ('a * Substring.t) option
val return : 'a -> 'a t
val (>>=) : 'a t -> ('a -> 'b t) -> 'b t
(* validate/map input with f. If f returns none then the parser fails.
   If f return Some x then x is the result returned *)
val filter_map : 'a t -> f:('a -> 'b option) -> 'b t
(* run the parser against the string and return None if parsing fails. On
   success Some (x, rest) where x is the result and rest is the remaining
   string that needs to be parsed *)
val run : 'a t -> Substring.t -> ('a * Substring.t) option
(* little helpers *)
val drop_prefix : string -> unit t
val drop_while  : (char -> bool) -> unit t
val take_while  : (char -> bool) -> string t

Finally, here’s the type representing our DSL:

type (_, _) t =
  | Try_parse : unit Parser.t -> ('a, 'a) t
  | Parse     : 'b Parser.t -> ('a, 'b -> 'a) t
  | Concat    : ('b, 'c) t * ('a, 'b) t -> ('a, 'c) t

Enumerating the constructors:

  • Try_parse represents a parser whose result we discard. Hence we use (). It will be useful to implement s.

  • Parse represents a parser whose result we retain. It will be useful to implement typed parameters

  • Concat will be used to sequence routes. It will be useful to implement </>

The key to the whole solution is one simple function interpreting our DSL:

let rec ints : type a b . (a, b) t -> b -> a Parser.t =
  let open Option.Monad_infix in
  fun t f inp ->
    match t with
    | Try_parse p -> Parser.run p inp >>| fun ((), inp') -> (f, inp')
    | Parse p -> Parser.run p inp >>| fun (v, s) -> (f v, s)
    | Concat (a, b) ->
      ints a f inp >>= fun (vb, inp') ->
      ints b vb inp'

This function takes a typed route and a function corresponding to a handler and returns a parser that returns the result of our handler on success.

We just need a bit more massaging to fit our proposed interface:

let match_url t s cb =
  let s = Substring.of_string s in
  match ints t cb s with
  | None -> None
  | Some (x, subs) ->
    if subs |> Substring.to_string |> String.is_empty then
      Some x
    else (* we did not consume the whole string so no match *)
      None

The rest is a small matter of programming the combinators:

(* a little helper to make sure our params are not empty *)
let non_empty = Parser.filter_map ~f:(fun x ->
  if String.is_empty x
  then None
  else Some x)

let int = Parse (fun x ->
  (Char.is_digit
   |> Parser.take_while
   |> non_empty
   |> Parser.map ~f:Int.of_string) x)

let s x = Try_parse (Parser.drop_prefix x)

let (</>) x1 x2 =
  let lead_slash x = Concat (s "/", x) in
  Concat (x1, lead_slash x2)

let str = Parse (fun x -> (Parser.take_while ((<>) '/') |> non_empty) x)

A little test using oUnit:

let it_works _ =
  let test_path = "/user/johndoe/comment/123" in
  let p = s "/user" </> str </> s "comment" </> int in
  match match_url p test_path (fun user id ->
    assert_equal ~printer "johndoe" user;
    assert_equal 123 id
  ) with
  | None -> assert_failure "Failed to match url"
  | Some _ -> print_endling "tada"

Conclusion

I have a prototype of Opium using these routes but it’s not for public consumption. I’d like to totally revamp routes from a scalability standpoint first before I improve the API for routing. There are also many other approaches to this problem (mostly from the land of Haskell). Here’s some great resources:

  • Happstack - The “pioneer” of type safe routing

  • The most recent and promising servant. However, it covers a whole lot more than just routing.

  • Spock is a tiny Haskell web framework that is most similar to my approach, although it uses type families under the hood.

At the end, I’m not sure which approach is the best for OCaml so I’m not ready to commit to anything in Opium yet. However, one thing is for certain: I definitely intend to the have the current untyped approach as an option.

Here’s a gist of a working version of my library. You’re welcome to fork and experiment, please let me know if you find any bugs.

One last thing. My implementation above can be called an “initial encoding” of the EDSL. You can actually implement the EDSL as a “final encoding” without GADT’s. I don’t see the advantage in either approach so I went with the initial encoding. I reached for it first and it gives me an excuse to play with GADT’s. A solution that uses a final encoding will be left as an exercise to the reader. (Who else has fond memories of this phrase from back in college?)

Comments

comments powered by Disqus