# Using Custom Entropy Models With `constriction`

- **Author:** Robert Bamler, University of Tuebingen
- **Initial Publication Date:** Jan 4, 2022

This is an interactive Jupyter notebook.
You can read this notebook [online](https://github.com/bamler-lab/constriction/blob/main/examples/python/02-custom-entropy-models.ipynb), but if you want to execute any code, we recommend that you [download](https://raw.githubusercontent.com/bamler-lab/constriction/main/examples/python/02-custom-entropy-models.ipynb) it.

More examples, tutorials, and reference materials are available at .

## Install Constriction

Before you start, install `constriction` and `scipy` by executing the following cell, then restart your Jupyter kernel:

In [None]:
%pip install constriction~=0.3.5 scipy # (this will automatically also install numpy)

**Don't forget to restart your Jupyter kernel.**
Then test if you can import `constriction`:

In [1]:
import constriction # This cell should produce no output (in particular, no error messages).

We further recommend that you read through the simple ["Hello, World"](https://github.com/bamler-lab/constriction/blob/main/examples/python/01-hello-world.ipynb) example before continuing with this example.

## Example 1: Fixed Custom Entropy Model

Constriction provides a selection of [predefined entropy models and model families](https://bamler-lab.github.io/constriction/apidoc/python/stream/model.html).
However, these sometimes do not cover everyone's needs.
Therefore, `constriction` also provides two escape hatches:

- You can use any univariate distribution from the popular `scipy` package and wrap it in a [`ScipyModel`](https://bamler-lab.github.io/constriction/apidoc/python/stream/model.html#constriction.stream.model.ScipyModel).
 This converts the model to exactly invertible fixed-point arithmetic so that it can be used for entropy coding with `constriction`'s stream codes.
- If your model is not available in `scipy`, then you can use a [`CustomModel`](https://bamler-lab.github.io/constriction/apidoc/python/stream/model.html#constriction.stream.model.CustomModel) and explicitly provide your own cumulative distribution function (and its approximate inverse) as an arbitrary Python function (which can optionally take additional model parameters, as in Example 2 below).

In this first example, we'll use `scipy`'s Cauchy distribution to define a *fixed* (i.e., fully parametrized) entropy model.
In Example 2 below, we'll show how we can construct custom models that accept additional model parameters.

In [2]:
import constriction
import numpy as np
import scipy.stats

# Get a scipy model with fixed parameters (`loc` and `scale`). See below
# for an example of a model *family* with variable model parameters.
scipy_model = scipy.stats.cauchy(loc=10.2, scale=30.9)

# Quantize the model to bins of size 1 centered at integers from -100 to 100:
model = constriction.stream.model.ScipyModel(scipy_model, -100, 100)

# Encode and decode some message using the above model for all symbols:
message = np.array([3, 2, 6, -51, -19, 5, 87], dtype=np.int32)
print(f"Original message: {message}")
encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, model)
compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")

decoder = constriction.stream.queue.RangeDecoder(compressed)
reconstructed = decoder.decode(model, 7) # (decodes 7 symbols)
print(f"Reconstructed message: {reconstructed}")
assert np.all(reconstructed == message)

Original message: [ 3 2 6 -51 -19 5 87]
compressed representation: [1831109982 151859470]
Reconstructed message: [ 3 2 6 -51 -19 5 87]
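
To make the phrase "bins of size 1 centered at integers" concrete, the following sketch computes the probability mass that the Cauchy distribution from the cell above assigns to each bin, using only the standard library. It is a simplified illustration rather than a description of what `ScipyModel` does internally: the helper names `cauchy_cdf` and `bin_probabilities` are made up for this example, the renormalization over the range from -100 to 100 is an assumption of the sketch, and the conversion to fixed-point arithmetic is not modeled at all.

```python
import math

LOC, SCALE = 10.2, 30.9              # same parameters as in the cell above
MIN_SYMBOL, MAX_SYMBOL = -100, 100   # same symbol range as in the cell above


def cauchy_cdf(x, loc=LOC, scale=SCALE):
    """Closed-form CDF of a Cauchy distribution."""
    return 0.5 + math.atan((x - loc) / scale) / math.pi


def bin_probabilities():
    """Probability mass of each unit-width bin [k - 0.5, k + 0.5)."""
    raw = {
        k: cauchy_cdf(k + 0.5) - cauchy_cdf(k - 0.5)
        for k in range(MIN_SYMBOL, MAX_SYMBOL + 1)
    }
    # Assumption of this sketch: renormalize so that the bins sum to one.
    # The library may treat the tails outside the range differently.
    total = sum(raw.values())
    return {k: p / total for k, p in raw.items()}


probs = bin_probabilities()
message = [3, 2, 6, -51, -19, 5, 87]

# Information content of the message under the sketched quantized model.
ideal_bits = sum(-math.log2(probs[symbol]) for symbol in message)
print(f"Information content under this sketch: {ideal_bits:.1f} bits")
```

The printed value gives a rough reference point for judging the size of the compressed representation printed above.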


### Using `CustomModel` Instead of `ScipyModel`

In the above example, we used the `ScipyModel` wrapper.
We could instead have defined the model explicitly through a cumulative distribution function (CDF) and its (approximate) inverse, the percent point function (PPF), by replacing:

```python
model = constriction.stream.model.ScipyModel(scipy_model, -100, 100)
```

with:

```python
model = constriction.stream.model.CustomModel(scipy_model.cdf, scipy_model.ppf, -100, 100)
```

The `CustomModel` wrapper allows you to easily define arbitrary custom entropy models:

```python
model = constriction.stream.model.CustomModel(
 lambda x: ... TODO ..., # define your CDF here
 lambda xi: ... TODO ..., # define an (approximate) inverse of the CDF here
 -100, 100) # (or on whichever range you want to define your model)
```

Only the first parameter (the CDF) must be carefully designed.
The second parameter (inverse CDF) does not need to be very accurate since it is only used to speed up an internal *exact* inversion of a fixed-point approximation of the CDF.
For details, please refer to the [API documentation](https://bamler-lab.github.io/constriction/apidoc/python/stream/model.html#constriction.stream.model.CustomModel).
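
The following sketch illustrates why an inaccurate inverse can be enough. It pairs a logistic CDF with a deliberately crude inverse and then recovers the correct integer bin by a short search that consults only the CDF. Everything here is an assumption made for illustration: the logistic model, the helper `find_symbol`, and the search strategy are hypothetical and are not taken from `constriction`, which, as stated above, inverts a fixed-point approximation of the CDF.

```python
import math

MIN_SYMBOL, MAX_SYMBOL = -100, 100


def cdf(x):
    """Logistic CDF with location 0 and scale 10 (hypothetical example model)."""
    return 1.0 / (1.0 + math.exp(-x / 10.0))


def approximate_inverse_cdf(xi):
    """Deliberately crude inverse: the logistic PPF rounded to a multiple of 5."""
    exact = 10.0 * math.log(xi / (1.0 - xi))
    return 5.0 * round(exact / 5.0)


def find_symbol(xi):
    """Return the integer k with cdf(k - 0.5) <= xi < cdf(k + 0.5).

    The approximate inverse only supplies the starting point of the search.
    The result is determined by the CDF alone. Quantiles that fall outside
    the symbol range are assigned to the nearest boundary symbol.
    """
    k = int(round(approximate_inverse_cdf(xi)))
    k = max(MIN_SYMBOL, min(MAX_SYMBOL, k))
    while k > MIN_SYMBOL and xi < cdf(k - 0.5):
        k -= 1
    while k < MAX_SYMBOL and xi >= cdf(k + 0.5):
        k += 1
    return k


xi = cdf(3.0)  # a quantile that lies inside the bin of symbol 3
print(f"crude guess: {approximate_inverse_cdf(xi)}, exact bin: {find_symbol(xi)}")
```

A better inverse would shorten the search but would not change its result, which is why only the CDF needs careful design in this sketch.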

## Example 2: *Parameterized* Custom Entropy Model

Example 1 above encoded and decoded a message of i.i.d. symbols, i.e., we used the same custom-defined model for each symbol.
We can also use `ScipyModel` or `CustomModel` to define model *families* that still have free parameters, which we can then set for each symbol individually when encoding or decoding:

In [3]:
import constriction
import numpy as np
import scipy.stats

# Get a scipy model *family*, i.e., a model with free parameters 
scipy_model_family = scipy.stats.cauchy

# Turn it into a family of (eventually) quantized entropy models:
model_family = constriction.stream.model.ScipyModel(scipy_model_family, -100, 100)

# Provide model parameters for each symbol:
message = np.array([3, 2, 6, -51, -19, 5, 87 ], dtype=np.int32) # (same message as in Example 1)
locs = np.array([7.2, -1.4, 9.1, -60.1, 3.9, 8.1, 63.2], dtype=np.float64)
scales = np.array([4.3, 5.1, 6, 14.2, 31.9, 7.2, 10.7], dtype=np.float64)

# Encode and decode the message using individual model parameters for each symbol:
print(f"Original message: {message}")
encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, model_family, locs, scales)
compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")

decoder = constriction.stream.queue.RangeDecoder(compressed)
reconstructed = decoder.decode(model_family, locs, scales)
print(f"Reconstructed message: {reconstructed}")
assert np.all(reconstructed == message)

Original message: [ 3 2 6 -51 -19 5 87]
compressed representation: [1123914182 166883892]
Reconstructed message: [ 3 2 6 -51 -19 5 87]


### Using `CustomModel` Instead of `ScipyModel`

Similar to the discussion of the fully parameterized model under Example 1 above, we could again define our model *family* with a `CustomModel` instead of a `ScipyModel`.
The CDF and PPF functions we provide just have to accept additional arguments for the model parameters:

```python
model_family = constriction.stream.model.CustomModel(
 scipy_model_family.cdf, scipy_model_family.ppf, -100, 100)
```

or, more generally:

```python
model_family = constriction.stream.model.CustomModel(
 lambda x, model_param1, model_param2: ... TODO ..., # CDF
 lambda xi, model_param1, model_param2: ... TODO ..., # approximate inverse CDF
 -100, 100) # (or on whichever range you want to define your model)
```

The number of model parameters is arbitrary but must match the number of parameter arrays provided when encoding or decoding.
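
As a concrete stand-in for the `... TODO ...` placeholders, the sketch below writes out a Cauchy CDF and PPF by hand with the signature used in the lambdas above: the position first, followed by one argument per model parameter. The function names `cauchy_cdf` and `cauchy_ppf` are hypothetical, and the loop only pairs each symbol with its own parameters for illustration. It does not claim to reproduce how `constriction` invokes these functions.

```python
import math


def cauchy_cdf(x, loc, scale):
    """CDF of a Cauchy distribution with per-symbol parameters."""
    return 0.5 + math.atan((x - loc) / scale) / math.pi


def cauchy_ppf(xi, loc, scale):
    """Inverse of `cauchy_cdf` for quantiles xi strictly between 0 and 1."""
    return loc + scale * math.tan(math.pi * (xi - 0.5))


# Same message and parameters as in the cell above, as plain Python lists.
message = [3, 2, 6, -51, -19, 5, 87]
locs = [7.2, -1.4, 9.1, -60.1, 3.9, 8.1, 63.2]
scales = [4.3, 5.1, 6.0, 14.2, 31.9, 7.2, 10.7]

# Two parameter lists match the two extra arguments (loc, scale) of the functions.
masses = [
    cauchy_cdf(symbol + 0.5, loc, scale) - cauchy_cdf(symbol - 0.5, loc, scale)
    for symbol, loc, scale in zip(message, locs, scales)
]

for symbol, mass in zip(message, masses):
    print(f"symbol {symbol:4d}: mass of its unit-width bin = {mass:.4f}")
```

A model with three parameters would need CDF and PPF functions with three extra arguments and, correspondingly, three parameter arrays.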

## Further Reading

See the Python API documentation for [`ScipyModel`](https://bamler-lab.github.io/constriction/apidoc/python/stream/model.html#constriction.stream.model.ScipyModel) and [`CustomModel`](https://bamler-lab.github.io/constriction/apidoc/python/stream/model.html#constriction.stream.model.CustomModel).