jyafn

Crates.io: jyafn
lib.rs: jyafn
version: 0.3.1
source: src
created_at: 2024-07-30 23:30:58.619098
updated_at: 2024-08-13 08:15:08.660802
description: Computational graphs for Data Science that compile to machine code
homepage: https://github.com/viodotcom/jyafn
repository: https://github.com/viodotcom/jyafn
id: 1320463
size: 3,023,772
Pedro B. Arruda (tokahuke)

Just Your Average Function

jyafn is a project for enabling MLOps by using computational graphs that compile to machine code, with a convenient and familiar Python interface.

💡 Don't forget to check out the docs for more in-depth info.

There is something going on!

Look at this little innocent piece of code:

import jyafn as fn
import numpy as np

@fn.func
def reduce_sum(mat: fn.tensor[2, 2]) -> fn.scalar:
    return np.sum(mat)

I know, it looks a bit funny. But what if I told you that:

  1. This compiles to machine code.
  2. You can still call it as a regular Python function.
  3. You can export it, load it and call it from Go (or Rust, or Python, or C!)

Neat, huh? It's basically tf.function + onnx in a single package!

A quick example

Let's write a silly function in jyafn:

@fn.func
def a_fun(a: fn.scalar, b: fn.scalar) -> fn.scalar:
    return 2.0 * a + b + 1.0

It's so silly that if you call it like you normally would, a_fun(2, 3), you get what you expect, 8. But that is not the fun part. The fun part is that you can export this function to a file:

with open("a_fun.jyafn", "wb") as f:
    f.write(a_fun.dump())

And now you can pass this file anywhere and it will work. Let's call it, for example, from Go:

// Read exported data:
code, err := os.ReadFile("a_fun.jyafn")
if err != nil {
    log.Fatal(err)
}

// Load the function:
fn, err := jyafn.LoadFunction(code)
if err != nil {
    log.Fatal(err)
}

// Call the function:
result, err := jyafn.Call[float64](
    fn,
    struct {
        a float64
        b float64
    }{a: 2.0, b: 3.0},
)
if err != nil {
    log.Fatal(err)
}

fmt.Println(result, "==", 8.0)

How to use it

In all cases, unfortunately, you will need GNU's binutils (or an equivalent) installed (it is not a build dependency!), since jyafn needs an assembler and a linker to finish QBE's job. On most computers, it is most likely already installed (as part of gcc or Python). However, this is a detail that you need to be aware of when, e.g., building a Docker image. Also, jyafn is guaranteed not to work on Windows. For your specific programming environment, see below:
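The binutils requirement above can be checked up front, for instance as a step in a Docker build. This is a minimal sketch, assuming the assembler and linker are exposed on PATH under the GNU binutils names `as` and `ld`; an equivalent toolchain may use different names, and the helper `missing_tools` is hypothetical, not part of jyafn.

```python
import shutil


def missing_tools(tools=("as", "ld")):
    """Return the tools from `tools` that cannot be found on PATH.

    The default names are an assumption (GNU binutils); adjust them
    if your platform ships an equivalent assembler and linker.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("missing from PATH:", ", ".join(missing))
else:
    print("assembler and linker found")
```

An empty list means both tools were found. This only checks that the executables exist, not that they are the ones jyafn will actually invoke.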

Python

Get the package from PyPI

This is the most convenient way of getting jyafn:

pip install jyafn

Build from source

Clone the repo, then

make install

This should do the trick. You can set the specific target Python version like so:

make install py=3.12

The default version is 3.11 at the moment.

At the moment, the Python version depends on the Rust compiler to build, since it compiles jyafn from source. As such, you will need cargo, Rust's package manager, as well as maturin, the tool for building Python packages from Rust code. Maturin can be easily installed with pip:

pip install maturin

Go

You can use this as a Go module:

import "github.com/viodotcom/jyafn/jyafn-go/pkg/jyafn"

You will also need to install the libjyafn shared object on your system, which is available for your platform in the latest GitHub release.

Rust

You can get the latest version of the jyafn crate, the basis of this project, from crates.io directly:

cargo add jyafn

C

Jyafn can be used directly from C via the libjyafn shared object, which is available in the latest GitHub release. Please check the Rust interface for details on how to use the available functions.

FAQ

There is something going on!

Yes, there is definitely something going on. What you see is basically a mini-JIT (just-in-time compiler). Your Python instructions (add this! multiply that!) are recorded in a computational graph, which is then compiled to machine code thanks to QBE, which does all the heavy lifting (the "compilery stuff"). This code is exposed as a function pointer in the running program.
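To make the "recording" step concrete, here is a toy tracer in plain Python. It is only a sketch of the general technique (operator overloading that appends nodes to a graph), not jyafn's actual implementation: the names `Graph`, `Ref`, `trace` and `evaluate` are hypothetical, and the graph is interpreted node by node instead of being compiled with QBE.

```python
class Graph:
    """A toy computational graph: a flat list of (op, args) nodes."""

    def __init__(self):
        self.nodes = []

    def push(self, op, *args):
        self.nodes.append((op, args))
        return Ref(self, len(self.nodes) - 1)


class Ref:
    """A symbolic value: a reference to one node of the graph."""

    def __init__(self, graph, index):
        self.graph = graph
        self.index = index

    def _lift(self, other):
        # Plain Python numbers become constant nodes.
        if isinstance(other, Ref):
            return other
        return self.graph.push("const", float(other))

    def __add__(self, other):
        return self.graph.push("add", self.index, self._lift(other).index)

    def __radd__(self, other):
        return self._lift(other) + self

    def __mul__(self, other):
        return self.graph.push("mul", self.index, self._lift(other).index)

    def __rmul__(self, other):
        return self._lift(other) * self


def trace(func, n_inputs):
    """Call `func` once on symbolic inputs, recording every operation."""
    graph = Graph()
    inputs = [graph.push("input", i) for i in range(n_inputs)]
    return graph, func(*inputs).index


def evaluate(graph, output, *values):
    """Interpret the graph in one pass; nodes only refer to earlier nodes."""
    results = []
    for op, args in graph.nodes:
        if op == "input":
            results.append(float(values[args[0]]))
        elif op == "const":
            results.append(args[0])
        elif op == "add":
            results.append(results[args[0]] + results[args[1]])
        elif op == "mul":
            results.append(results[args[0]] * results[args[1]])
    return results[output]


def a_fun(a, b):
    return 2.0 * a + b + 1.0


graph, out = trace(a_fun, 2)
print(evaluate(graph, out, 2, 3))  # prints 8.0
```

Because every node can only point at nodes recorded before it, the graph is acyclic by construction and evaluation takes exactly one step per node.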

Isn't loading arbitrary code just a security hazard?

Yes, but the code produced by jyafn is far from arbitrary. First, it's pure: it doesn't cause any outside mutations of any kind. Second, since it is based on a computational graph, which is acyclic, it is guaranteed to finish (and the size of the code limits how much time it takes to run). Third, no machine code is exchanged: that code is only valid for the duration of the process in which it resides. It's the computational graph and not the code that is exchanged.

So, I can just write anything in it and it will work?

No, far from it:

  1. Your Python function can only have constant-sized loops.
  2. The libraries you are using must be able to work with generic Python objects (i.e., support the magic dunder methods, __add__, __sub__, etc...). Thankfully, numpy does just that for its ndarrays.
  3. if-else is still a bit wonky (you can use the choose method instead). This can be solved in the future, if there is demand.
  4. For now, support is focused on data science (DS) uses. So, floats and bools are supported, but not other primitive types.
  5. Once in a while, you may hit a function, out of a litany of them, that is not supported yet. However, implementing it is very easy.
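Points 1 to 3 follow from how tracing works, which a toy symbolic value can illustrate. This is a sketch under assumptions, not jyafn code: `Sym`, `as_expr` and this `choose` are hypothetical stand-ins (the real `choose` may have a different signature), and expressions are recorded as text purely for readability.

```python
class Sym:
    """A toy symbolic value that records expressions as text."""

    def __init__(self, expr):
        self.expr = expr

    def __add__(self, other):
        return Sym(f"({self.expr} + {as_expr(other)})")

    def __gt__(self, other):
        return Sym(f"({self.expr} > {as_expr(other)})")

    def __bool__(self):
        # While tracing there is no concrete value to branch on,
        # which is why a plain `if` on a traced value cannot work here.
        raise TypeError("cannot branch on a symbolic value")


def as_expr(value):
    return value.expr if isinstance(value, Sym) else repr(value)


def choose(cond, if_true, if_false):
    """Hypothetical stand-in: records both branches as a single node."""
    return Sym(f"choose({as_expr(cond)}, {as_expr(if_true)}, {as_expr(if_false)})")


def sum3(xs):
    total = xs[0]
    for i in range(1, 3):  # constant trip count: unrolled while tracing
        total = total + xs[i]
    return total


xs = [Sym(f"x{i}") for i in range(3)]
print(sum3(xs).expr)                         # prints ((x0 + x1) + x2)
print(choose(xs[0] > 0.0, xs[0], 0.0).expr)  # prints choose((x0 > 0.0), x0, 0.0)
```

The loop leaves no trace of itself in the recorded expression, only its unrolled additions, so its trip count has to be known when the function is traced. The dunder methods are what allow ordinary Python code to run on symbolic values at all.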

Is it faster than pure Python?

You bet! There is a benchmark in ./jyafn-python/tests/simple_graph.py that you can take a look at. One example showed a 10x speedup over vanilla CPython.

Which programming languages are supported?

For now, Go, Rust, Python and C. You can use the cjyafn library to port jyafn to your language. Compiled shared objects are available as GitHub releases for this repo.

What is the current status of the project?

It's mature enough for a test. Guinea pigs wanted!
