# haskell_bits

Various typeclasses and concepts from Haskell, implemented in Rust.

([crates.io link](https://crates.io/crates/haskell_bits))

## Synopsis

Currently this library is an implementation of the Functor/Applicative/Monad hierarchy in Rust.

It's very much a work in progress, and I'm releasing it now largely as a request for comment. I'd like to discuss my design choices with others before I go too far down this path.

There has been some work in this space, for example:

* [`fp-core`](https://crates.io/crates/fp-core)
* [`frunk`](https://crates.io/crates/frunk)
* [`functional`](https://crates.io/crates/functional)

## So what makes this library different?!

The one thing that (I believe) makes this library unique is that one can define generic functions on Monads, Applicatives, Functors etc.

The libraries mentioned above seem to define Functor, Applicative and Monad for various types, so `fmap`, `bind` etc. work on those types, but there doesn't seem to be a way to write one generic function that works on all types defined as, say, Monad.

This, I think, is a very important issue. The great thing about typeclasses like Monad in Haskell is that if you've got `m` different Monads, and `n` different functions defined on Monads, you've now got effectively `m*n` functions defined. Make a new Monad, and you've got a huge infrastructure for free. Make a new Monad function, and it works on hundreds of Monadic types already defined by others.

This is one of the key things that makes Monads and associated concepts, and more generally Haskell, so powerful: you can combine things in a multiplicative number of ways, so much so that many problems just consist of putting together existing bits of code, which the type system ensures you can only do safely and, almost always, correctly. This is a huge advantage in terms of productivity and, in particular, reliability.
## Example

Here is a simple Haskell function:

```hs
monadic_pair :: Monad m => m a -> m a -> m (a, a)
monadic_pair x y = do
  x_val <- x
  y_val <- y
  pure (x_val, y_val)
```

And this is the equivalent using this library in Rust:

```rust
fn monadic_pair<TCon, TArg, T>(x: &TArg, y: &TArg) -> <TCon as WithTypeArg<(T, T)>>::Type
where
    TCon: Monad + WithTypeArg<T> + WithTypeArg<(T, T)>,
    TArg: TypeApp<TCon, T>,
    T: Clone,
{
    mdo! {
        x_val =<< x;
        y_val =<< y;
        ret<TCon> (x_val.clone(), y_val.clone());
    }
}
```

We can then apply this to vectors:

```rust
let v1: Vec<u32> = vec![1, 2, 3];
let v2: Vec<u32> = vec![4, 5, 6];
let v_result = monadic_pair(&v1, &v2);
assert_eq!(
    v_result,
    vec![
        (1, 4), (1, 5), (1, 6),
        (2, 4), (2, 5), (2, 6),
        (3, 4), (3, 5), (3, 6)
    ]
);
```

Or options:

```rust
let o1: Option<u32> = Some(1);
let o2: Option<u32> = Some(2);
let o3: Option<u32> = None;
let o1_result = monadic_pair(&o1, &o2);
let o2_result = monadic_pair(&o1, &o3);
assert_eq!(o1_result, Some((1, 2)));
assert_eq!(o2_result, None);
```

All with the expected results.

We can even do the standard applicative style function application:

```rust
let _applicative_option_result = (|x| move |y| x + y).lmap(o1).lap(o2);
assert_eq!(Some(3), _applicative_option_result);
```

With some caveats I'll mention later.

## So how does it work?

Rust does not have higher kinded types, or at least not ones we can abstract over. What I mean by this is that whilst Rust happily allows `Vec<T>` for some generic `T`, Rust doesn't allow `T<u32>` for some generic `T`. It allows `Vec<u32>`, but we can't abstract over `Vec` like we can `u32`.

This has some particular consequences. Let's say we want to write a trait in Rust that encapsulates the idea of applying a function to all parts of a structure. We might try:

```rust
// Pseudocode: this is what we would like to write.
trait Mappable {
    fn map<A, B>(f: impl Fn(A) -> B, x: Self<A>) -> Self<B>;
}

impl Mappable for Vec { ... }
impl Mappable for Option { ... }
// etc...
```

#### A slight snag

Except we can't do this. Rust will refuse to compile this, because `Vec`, `Option` (and by inference, `Self`) are not types. `Option<u32>` is, but `Option` is not.
So we need to do a trick:

```rust
pub trait WithTypeArg<T> {
    type Type;
}
```

And then:

```rust
pub struct VecTypeCon;

impl<T> WithTypeArg<T> for VecTypeCon {
    type Type = Vec<T>;
}
```

We can then define `Mappable` like so:

```rust
pub trait Mappable {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type;
}
```

And here's a simple implementation for `Option`s:

```rust
impl Mappable for OptionTypeCon {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type {
        Option::map(x, f) // Option itself already has a specific "map" function
    }
}
```

and for `Vec`s:

```rust
impl Mappable for VecTypeCon {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type {
        // Reserve room for exactly one output element per input element.
        let size = x.len();
        let mut v: Vec<TOut> = Vec::with_capacity(size);
        for e in x {
            v.push(f(e));
        }
        v
    }
}
```

This is pretty good so far, but there's a small issue, which we'll sort out now. Recall that the main motivation for this library is to be able to combine these functions in a generic way. So let's make a function that just does two maps in a row (this is a silly function, as it would be more efficient just to compose the two functions, but let's go with it):

```rust
fn map2<TCon, TIn, TMid, TOut>(
    f: impl Fn(TIn) -> TMid,
    g: impl Fn(TMid) -> TOut,
    x: <TCon as WithTypeArg<TIn>>::Type,
) -> <TCon as WithTypeArg<TOut>>::Type {
    map(g, map(f, x))
}
```

This unfortunately doesn't work, even though it looks quite type-correct. I think it's because Rust won't match against a type unless it's fully resolved, and `<TCon as WithTypeArg<TIn>>::Type` can't be resolved because we don't know what `TCon` or `TIn` is. Therefore we can't pass it to `map`, even though `map` wants an argument of `<TCon as WithTypeArg<TIn>>::Type` (the exact same thing), because I think Rust wants to resolve them both before checking if they're equal.

So what we do is adjust our definition of `WithTypeArg`:

```rust
pub trait WithTypeArg<T> {
    type Type: TypeApp<Self, T>;
}
```

And write this new trait `TypeApp`:

```rust
pub trait TypeApp<TCon, T>
where
    TCon: WithTypeArg<T>,
    Self: Is<Type = <TCon as WithTypeArg<T>>::Type>,
{
}
```

Yes, notice these traits refer to each other. Also of importance is the `Is` trait.
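
To give a feel for what an `Is`-style trait does, here is a minimal sketch. It is a simplified stand-in written for this README, not the definition from `is_type` (which has more methods), and the helper functions `double` and `wrap` are hypothetical names used only for illustration. The idea is that a single blanket implementation makes the bound `X: Is<Type = u32>` satisfiable only by `u32` itself, so generic code can convert between `X` and `u32` freely.

```rust
// Simplified, hypothetical stand-in for the `Is` trait from `is_type`.
pub trait Is {
    type Type;
    fn into_val(self) -> Self::Type;
    fn from_val(x: Self::Type) -> Self;
}

// The only implementation: every type "is" itself.
impl<T> Is for T {
    type Type = T;
    fn into_val(self) -> T {
        self
    }
    fn from_val(x: T) -> T {
        x
    }
}

// `X` is a generic parameter, but the bound pins it to exactly `u32`,
// so `into_val` hands back a real `u32` we can do arithmetic on.
pub fn double<X: Is<Type = u32>>(x: X) -> u32 {
    x.into_val() * 2
}

// The conversion also works in the other direction.
pub fn wrap<X: Is<Type = u32>>(n: u32) -> X {
    X::from_val(n)
}
```

With these definitions `double(21u32)` evaluates to `42`, and `wrap` can only ever be instantiated at `X = u32`.
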
This does take some getting your head around, so I would suggest looking at the source of my [`is_type`](https://docs.rs/crate/is_type/0.2.0/source/src/lib.rs) library. But basically what this allows one to do is to force a one-to-one correspondence between a trait and an actual type, so you can go back and forth between them at will.

Then we can redefine `map` as follows, replacing any occurrence of an argument (but not return type) of:

`<TCon as WithTypeArg<T>>::Type`

with:

`impl TypeApp<TCon, T>`

```rust
pub trait Mappable {
    fn map<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: impl TypeApp<Self, TIn>,
    ) -> <Self as WithTypeArg<TOut>>::Type;
}
```

and likewise change `map2`:

```rust
fn map2<TCon, TIn, TMid, TOut>(
    f: impl Fn(TIn) -> TMid,
    g: impl Fn(TMid) -> TOut,
    x: impl TypeApp<TCon, TIn>,
) -> <TCon as WithTypeArg<TOut>>::Type {
    map(g, map(f, x))
}
```

and we're good. For some reason Rust will not pass `<TCon as WithTypeArg<T>>::Type` to an argument of type `<TCon as WithTypeArg<T>>::Type`, but it will pass `<TCon as WithTypeArg<T>>::Type` to a trait argument of `TypeApp<TCon, T>`.

Note that in the implementation this means we'll have to replace all occurrences of `x` with `x.into_val()`, where `into_val` is part of the trait `Is` (in the where clause of `TypeApp`).

Basically, this is the core of the approach. We just do a similar thing for Applicative and Monad, but there are a few details and snags along the way, which I'll detail below.

The trait above is called `Mappable`, but what I've described is actually called `LinearFunctor` in the library, which I'll also talk about now:

## The structure of the library.

Currently, there are 7 main traits in the library:

* `Functor`
* `Lift`
* `Applicative`
* `Monad`
* `LinearFunctor`
* `LinearApplicative`
* `LinearMonad`

Functor, Applicative and Monad are basically copies from Haskell, except the `pure` function of `Applicative` is split into the trait `Lift`. So both `Lift` and `Applicative` extend `Functor`, and `Monad` extends both `Lift` and `Applicative`.

The `LinearFunctor/LinearApplicative/LinearMonad` are the "by value" versions of `Functor/Applicative/Monad`. These typeclasses consume their arguments.
This can be an efficiency gain because they don't need to copy their arguments. It also means they can be defined on types which aren't `Clone`.

The standard prefix for a "normal" typeclass (i.e. handling data by reference) is either `f` (in Functor) or no prefix (otherwise). The "by value" versions (i.e. "linear") are prefixed by an `l`. For example, `fmap` is for Functor, and `lmap` is for LinearFunctor.

Note that out of the `Linear*` classes, `LinearFunctor` is a bit special. `Functor` extends `LinearFunctor`, that is, every `Functor` is also a `LinearFunctor` (but not the other way around). This is because if one defines `Functor`, one can always define `LinearFunctor` just by using a reference.

Both `LinearApplicative` and `LinearMonad` take their function argument as an `FnOnce` argument. This is done, particularly in the case of `LinearApplicative`, to allow chaining. Running a produced closure twice will probably require an explicit `.clone()` at some point.

That means `LinearApplicative` and `LinearMonad` can be defined for far fewer types, generally ones that don't "multiply". For example, whilst `Vec` is a `LinearFunctor`, it's not a `LinearApplicative` or `LinearMonad`. `Option`, however, is all of these.

Also, all trait functions have plain old top-level functions that call them, and often have other trait functions that call them as well, although these extra trait functions exist only to allow `.` notation. I'll go into some technical details about the reasoning for these now.

## Technical details (particularly for implementors of traits)

I'm going to give the `LinearFunctor` trait as an example.
You've seen code similar to this before, but this is the actual code:

```rust
// Implement this trait for LinearFunctor
pub trait LinearFunctor {
    fn lmap<TIn, TOut>(
        f: impl Fn(TIn) -> TOut,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        Self: WithTypeArg<TIn> + WithTypeArg<TOut>;
}

// Call this for lmap(f, x) syntax
pub fn lmap<TCon, TIn, TOut>(
    f: impl Fn(TIn) -> TOut,
    x: impl TypeApp<TCon, TIn>,
) -> <TCon as WithTypeArg<TOut>>::Type
where
    TCon: LinearFunctor + WithTypeArg<TIn> + WithTypeArg<TOut> + ?Sized,
{
    <TCon as LinearFunctor>::lmap(f, x.into_val())
}

// This is for f.lmap(x) syntax
pub trait LMapExt {
    fn lmap<TCon, TIn, TOut>(
        self,
        x: impl TypeApp<TCon, TIn> + Sized,
    ) -> <TCon as WithTypeArg<TOut>>::Type
    where
        Self: Fn(TIn) -> TOut + Sized,
        TCon: LinearFunctor + WithTypeArg<TIn> + WithTypeArg<TOut>,
    {
        lmap(self, x)
    }
}

impl<T> LMapExt for T {}
```

There are three functions here: the first is just a declaration, the second is an implementation that calls the first, and the third calls the second.

The first is what you actually implement. Note that the arguments here are actual types. There's no `x: impl TypeApp<TCon, TIn>`. This means, as discussed above, that this function [works poorly](#A-slight-snag) when it comes to type inference.

So the second function here, the top-level function, instead takes the `TypeApp` trait as its argument, which means it works better with type inference. But then we need to worry about converting from the trait type to the "real" type, so we call `into_val()` from the `is_type` crate to do this.

The third function is just a trait so we can use the `.` syntax if we want. This becomes useful if we want to chain things like operators, i.e. `f.lmap(x).lap(y)`.

Note that only in the case of `Functor`, the function `map` works with both by-value and by-reference arguments, and calls either `fmap` or `lmap` depending on whether the arguments are by value or by reference.

## Do notation

The macro `mdo!` allows one to write in "do-notation" form. This code is largely stolen (slightly modified) from [rust-mdo](https://github.com/TeXitoi/rust-mdo/blob/master/src/lib.rs).
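
As a rough guide to what a do-block means, each `=<<` line corresponds to one nested bind, and the final `ret` wraps the result. The sketch below spells that nesting out by hand for `Option` and for vectors using only standard library methods. It is not the expansion that `mdo!` produces, and the function names `pair_options` and `pair_vecs` are hypothetical.

```rust
// Hand-written equivalents of the two-line do-block
//     x_val =<< x; y_val =<< y; ret (x_val, y_val)
// using only std methods. Hypothetical names, not part of haskell_bits.

/// The do-block specialised to `Option`.
pub fn pair_options(x: &Option<u32>, y: &Option<u32>) -> Option<(u32, u32)> {
    // Outer bind: short-circuits to `None` if `x` is empty.
    x.as_ref().and_then(|x_val| {
        // Inner bind followed by `ret`, which for `Option` collapses to `map`.
        y.as_ref().map(|y_val| (*x_val, *y_val))
    })
}

/// The do-block specialised to vectors (taken as slices here).
pub fn pair_vecs(x: &[u32], y: &[u32]) -> Vec<(u32, u32)> {
    x.iter()
        // Outer bind: for vectors, bind is `flat_map`.
        .flat_map(|x_val| {
            // Inner bind plus `ret`: one pair per element of `y`.
            y.iter().map(move |y_val| (*x_val, *y_val))
        })
        .collect()
}
```

For vectors this yields every combination in order, and for options it yields `None` as soon as either input is `None`, which matches the `monadic_pair` results shown in the Example section.
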
Do notation is currently only defined with the by-reference version of the Monad functions, so you'll see `&` and `Clone::clone(...)` in many places. Also, sometimes `Clone::clone` is better for type inference than `.clone()`. I believe it's because the latter works for both values and references.

## Quirks

There are a few little quirks one gets from this approach, which make things slightly messier than equivalent Haskell code in some cases. I'll talk about features that could be added to Rust in [this later section](#Features-that-could-be-added-to-Rust-to-make-things-nicer).

### `WithTypeArg` constraints everywhere

When writing generic functions over Functors, Applicatives or Monads, one has to use the constraint `WithTypeArg` all over the place.

Indeed, the function `map2` above doesn't actually work as written. One needs to write it like this:

```rust
fn lmap2<TCon, TIn, TMid, TOut>(
    f: impl Fn(TIn) -> TMid,
    g: impl Fn(TMid) -> TOut,
    x: impl TypeApp<TCon, TIn>,
) -> <TCon as WithTypeArg<TOut>>::Type
where
    TCon: LinearFunctor + WithTypeArg<TIn> + WithTypeArg<TMid> + WithTypeArg<TOut>,
{
    lmap(g, lmap(f, x))
}
```

Notice all the `WithTypeArg<TIn> + WithTypeArg<TMid> + WithTypeArg<TOut>` constraints. One has to add this constraint for every type argument application that appears anywhere in the function. For example, in `lmap2` above we need `WithTypeArg<TMid>` even though it's not an input or output argument, just because it's the result of the innermost map call.

### Closure types can't be named

In stable Rust, closure types can't be named, nor (as far as I could work out) can one define types that implement `Fn` (I think this requires extensions). This comes into play when trying to define some functions.
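
To make the limitation concrete, here is a small standalone sketch with hypothetical names. A closure can only be described by the trait it implements (`impl Fn`), or hidden behind a trait object (`Box<dyn Fn>`), which does have a nameable type. Boxing is shown purely to illustrate the difference; nothing above says the library works this way.

```rust
// Hypothetical helper names, for illustration only.

// The closure's own type is anonymous, so the return type can only say
// "some type implementing Fn(u32) -> u32".
pub fn make_adder(n: u32) -> impl Fn(u32) -> u32 {
    move |x| x + n
}

// Erasing the closure behind a trait object gives a type we can write down.
pub fn make_boxed_adder(n: u32) -> Box<dyn Fn(u32) -> u32> {
    Box::new(move |x| x + n)
}

// With a nameable type, a "container of functions" can be spelled out in a
// signature, at the cost of an allocation and dynamic dispatch.
pub fn wrapped_adder(n: u32) -> Option<Box<dyn Fn(u32) -> u32>> {
    Some(make_boxed_adder(n))
}
```

There is no way to write the equivalent of `wrapped_adder` with the closure's real type in the signature, which is the gap the rest of this section runs into.
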
For example, consider the definition of `Applicative`:

```rust
pub trait Applicative: Functor + Lift {
    fn ap<TIn, TOut, TFunc>(
        f: &<Self as WithTypeArg<TFunc>>::Type,
        x: &<Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        Self: WithTypeArg<TFunc> + WithTypeArg<TIn> + WithTypeArg<TOut>,
        TFunc: Fn(&TIn) -> TOut,
    {
        Self::lift2(|y1: &TFunc, y2: &TIn| y1(y2), f, x)
    }

    fn lift2<TIn1, TIn2, TOut, TFunc>(
        f: TFunc,
        x1: &<Self as WithTypeArg<TIn1>>::Type,
        x2: &<Self as WithTypeArg<TIn2>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        Self: WithTypeArg<TIn1> + WithTypeArg<TIn2> + WithTypeArg<TOut>,
        TFunc: Fn(&TIn1, &TIn2) -> TOut;
}
```

We've defined `ap` in terms of `lift2`, but it would be nice to define it the other way around also, so implementers could choose which to implement. Roughly speaking, the [identity](http://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Applicative.html#t:Applicative) is as follows:

```rust
lift2(f, x, y) = f.fmap(x).ap(y)
```

But in Rust, without automatic currying, we'll need something like:

```rust
lift2(f, x, y) = (|x| |y| f(x,y)).fmap(x).ap(y)
```

So in this case, the output of `fmap` is an Applicative of functions. Eventually `ap` applies and this becomes just values, but the intermediate step is a function.

But this function doesn't have a type we can name. As above, Rust wants a `WithTypeArg` constraint for every argument to our functions, but we can't write the type for this function because it doesn't have a type name.

A practical issue with this is that you can't create, say, an `M<F>`, where `F` is some function, inside a function. You always have to pass it in.

### Some fiddling with type inference

Notice there are actually two `lift()` functions defined (`lift` is just `pure`, but `pure` was a former keyword so I've avoided it).

```rust
// lift(x)
pub fn lift<TCon, T>(x: T) -> <TCon as WithTypeArg<T>>::Type
where
    <TCon as WithTypeArg<T>>::Type: TypeApp<TCon, T>,
    TCon: Lift + WithTypeArg<T>,
{
    <TCon as Lift>::lift::<T>(x)
}

// lift_c(x)
pub fn lift_c<TCon, T, U>(x: U::Param) -> U
where
    T: Is<Type = U::Param>,
    U: TypeApp<TCon, T>,
    <TCon as WithTypeArg<T>>::Type: TypeApp<TCon, T>,
    TCon: Lift + WithTypeArg<T>,
{
    Is::from_val(lift::<TCon, T>(Is::from_val(x)))
}
```

They both do the same thing, indeed the second just calls the first, but their effect on type inference is different.
The first doesn't allow the inference to go backwards. For example, if one writes:

```rust
let x: Option<u32> = lift(5);
```

then the compiler will not be able to work this out, because `5` could be many different numeric types.

But if one writes:

```rust
let x: Option<u32> = lift_c(5);
```

it will be able to work out from the result type that `5` is a `u32`.

The problem is that `lift_c` doesn't work as well in generic functions over, say, all monads. So think of `lift_c` as "lift concrete", as in we have a concrete type.

There's a similar division with `bind` and `bind_c`.

As a result, there are also both `mdo!` and `mdo_c!` macros. `mdo!` should be used in generic code, but `mdo_c!` may be used when one knows the concrete types, as there may be better inference in that case.

One example of this is as follows:

```rust
let o1: Option<u32> = lift_c(5);
let o2: Option<u32> = lift_c(7);
let _do_result: Option<(u32, u32)> = mdo_c! {
    x =<< &o1;
    y =<< &o2;
    ret (Clone::clone(x), Clone::clone(y));
};
```

Notice here one doesn't have to be explicit about the type constructor of the return value out of the macro. But in `monadic_pair`:

```rust
fn monadic_pair<TCon, TArg, T>(x: &TArg, y: &TArg) -> <TCon as WithTypeArg<(T, T)>>::Type
where
    TCon: Monad + WithTypeArg<T> + WithTypeArg<(T, T)>,
    TArg: TypeApp<TCon, T>,
    T: Clone,
{
    mdo! {
        x_val =<< x;
        y_val =<< y;
        ret<TCon> (x_val.clone(), y_val.clone());
    }
}
```

We have to explicitly specify the type constructor of the result of the do-block, namely `TCon`. Rust can't infer this, unfortunately.

## Features that could be added to Rust to make things nicer

Overall, despite this looking a bit messy in some places, for client code it's not too bad. Generic functions over monads are a bit messy to define, but if what you're doing mostly is just using these functions on a particular concrete monad, the code is actually fairly clean.
But here are some things that would help:

### Higher kinded types (HKT)

An actual implementation and syntax for higher kinded types would be ideal, as we wouldn't need the "hack things into a trait" approach to make this work, and presumably type inference would also be nicer if this were done.

But failing that, there are a few features which would go a long way to help:

### Forall in where clauses

If we could define something like:

```rust
trait TypeCon where TCon : forall U. WithTypeArg<U>
```

This would solve the [`WithTypeArg` constraints everywhere](#WithTypeArg-constraints-everywhere) issue.

It would also solve the [closure types can't be named](#Closure-types-can't-be-named) issue, because there'd be no need to explicitly name such temporary types.

### Generic associated types (GAT)

This is an actual feature under development which may help, similarly to a "forall" option, but at the time of writing the particular part of this feature needed hasn't landed in nightly yet and has no particular timeline, according to [this issue reply](https://github.com/rust-lang/rust/issues/44265#issuecomment-560941760).

## Future work

There are a few obvious things to be done. The first is implementing Functor/Applicative/Monad for types other than `Option` and `Vec`. `Result` is the obvious next choice, along with other things in the standard Rust library.

`IO` is also a possibility, and I think implementing some of Haskell's [`Parsec`](https://hackage.haskell.org/package/parsec) could allow for some good illustrative examples of the power of this.

Also, some of the more useful monadic functions, like [`mapM`](http://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Monad.html#v:mapM), require [traversable](http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Traversable.html#t:Traversable) and [foldable](http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Foldable.html#t:Foldable), so these are the obvious traits to implement next.

## Where to start digging around the code.
There's a whole lot of test code dumped in the `test()` function in [`lib.rs`](https://github.com/clintonmead/haskell_bits/blob/master/src/lib.rs). That's probably the best place to start, as there are plenty of examples of usage there.

## Discussion and comments welcome!

The whole point of this was to open up discussion, particularly on core aspects of the design at this point, so please feel free to open a GitHub issue if you have any suggestions. Or open a PR if your changes are more concrete, or if you would just like to add some functions, traits or implementations.
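
For anyone who would rather experiment before reading the source, the core trick from "So how does it work?" can be reproduced in a few dozen lines. This is only a sketch under assumptions: it uses the simpler form of the traits from before `TypeApp` is introduced, the name `map_twice` and the extra `F`/`G` type parameters are mine, and none of it is copied from the library. Because of the inference limits described earlier, the generic function passes its type arguments explicitly and callers name the type constructor with a turbofish.

```rust
// Hypothetical, simplified re-creation of the encoding; not the crate's code.

/// "Apply the type constructor `Self` to the argument `T`."
pub trait WithTypeArg<T> {
    type Type;
}

pub struct VecTypeCon;
impl<T> WithTypeArg<T> for VecTypeCon {
    type Type = Vec<T>;
}

pub struct OptionTypeCon;
impl<T> WithTypeArg<T> for OptionTypeCon {
    type Type = Option<T>;
}

pub trait Mappable {
    fn map<TIn, TOut, F>(
        f: F,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        F: Fn(TIn) -> TOut,
        Self: WithTypeArg<TIn> + WithTypeArg<TOut>;
}

impl Mappable for OptionTypeCon {
    fn map<TIn, TOut, F>(
        f: F,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        F: Fn(TIn) -> TOut,
    {
        // Here the projection is known to be `Option<TIn>`.
        x.map(f)
    }
}

impl Mappable for VecTypeCon {
    fn map<TIn, TOut, F>(
        f: F,
        x: <Self as WithTypeArg<TIn>>::Type,
    ) -> <Self as WithTypeArg<TOut>>::Type
    where
        F: Fn(TIn) -> TOut,
    {
        // Here the projection is known to be `Vec<TIn>`.
        x.into_iter().map(f).collect()
    }
}

/// One generic function that works for every `Mappable` type constructor.
pub fn map_twice<TCon, A, B, C, F, G>(
    f: F,
    g: G,
    x: <TCon as WithTypeArg<A>>::Type,
) -> <TCon as WithTypeArg<C>>::Type
where
    TCon: Mappable + WithTypeArg<A> + WithTypeArg<B> + WithTypeArg<C>,
    F: Fn(A) -> B,
    G: Fn(B) -> C,
{
    // Type arguments are spelled out so nothing is left to inference.
    let mid = <TCon as Mappable>::map::<A, B, F>(f, x);
    <TCon as Mappable>::map::<B, C, G>(g, mid)
}
```

A call looks like `map_twice::<VecTypeCon, _, _, _, _, _>(|n: u32| n + 1, |n: u32| n * 2, vec![1u32, 2, 3])`. Note that the intermediate type `B` needs its own `WithTypeArg<B>` bound, which is the same effect described under "`WithTypeArg` constraints everywhere".
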