# Examples with rusty-machine

This directory gathers fully-fledged programs, each using a piece of `rusty-machine`'s API.

## Overview

* [K-Means](#k-means)
* [SVM](#svm)
* [Neural Networks](#neural-networks)
* [Naïve Bayes](#naïve-bayes)

## The Examples

### K-Means

#### Generating Clusters

[Generating Clusters](k-means_generating_cluster.rs) randomly generates data around a pair of clusters, then trains a K-Means model to learn new centroids from this sample. The example shows basic usage of the K-Means API, an unsupervised model, along with some basic usage of [rulinalg](https://github.com/AtheMathmo/rulinalg) to generate the data.

Sample run:

```
cargo run --example k-means_generating_cluster
   Compiling rusty-machine v0.4.0 (file:///rusty-machine/rusty-machine)
     Running `target/debug/examples/k-means_generating_cluster`
K-Means clustering example:
Generating 2000 samples from each centroids:
⎡-0.5 -0.5⎤
⎣   0  0.5⎦
Training the model...
Model Centroids:
⎡-0.812 -0.888⎤
⎣-0.525  0.877⎦
Classifying the samples...
Samples closest to first centroid: 1878
Samples closest to second centroid: 2122
```
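For orientation, the whole flow boils down to a few calls on `KMeansClassifier`. Here is a minimal sketch with hard-coded points standing in for the generated sample, assuming the `Result`-returning `train`/`predict` signatures and the `centroids()` accessor of recent `rusty-machine` releases (older releases differ slightly):

```rust
use rusty_machine::learning::k_means::KMeansClassifier;
use rusty_machine::learning::UnSupModel;
use rusty_machine::linalg::Matrix;

fn main() {
    // Four 2-D points forming two obvious clusters (row-major).
    let inputs = Matrix::new(4, 2, vec![-0.5, -0.5,
                                        -0.4, -0.6,
                                         0.0,  0.5,
                                         0.1,  0.4]);

    // Ask for two centroids; K-Means is unsupervised, so no targets.
    let mut model = KMeansClassifier::new(2);

    // Learn the centroids from the unlabeled sample.
    model.train(&inputs).unwrap();

    // The learned centroids (an Option, since the model may be untrained).
    if let Some(ref centroids) = *model.centroids() {
        println!("Centroids:\n{}", centroids);
    }

    // Assign each sample to the index of its closest centroid.
    let classes = model.predict(&inputs).unwrap();
    println!("Classes: {:?}", classes);
}
```

The full example does the same with 4,000 generated samples, then counts how many land closest to each learned centroid.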
### SVM

#### Sign Learner

[Sign learner](svm-sign_learner.rs) constructs and evaluates a model that learns to recognize the sign of an input number. The sample shows basic usage of the SVM API and configures the algorithm with a specific kernel (`HyperTan`). Evaluations run in a loop so the program can log individual predictions and do some bookkeeping for the performance report at the end. The salient `rusty-machine` calls are the `train` and `predict` methods of the SVM model. The evaluation set is simplistic, so the model achieves 100% accuracy (which is *really* too simple an example).

Sample run:

```
cargo run --example svm-sign_learner
   Compiling rusty-machine v0.3.0 (file:///rusty-machine/rusty-machine)
     Running `target/debug/examples/svm-sign_learner`
Sign learner sample:
Training...
Evaluation...
-1000 -> -1: true
-900 -> -1: true
-800 -> -1: true
-700 -> -1: true
-600 -> -1: true
-500 -> -1: true
-400 -> -1: true
-300 -> -1: true
-200 -> -1: true
-100 -> -1: true
0 -> -1: true
100 -> 1: true
200 -> 1: true
300 -> 1: true
400 -> 1: true
500 -> 1: true
600 -> 1: true
700 -> 1: true
800 -> 1: true
900 -> 1: true
Performance report:
Hits: 20, Misses: 0
Accuracy: 100
```

### Neural Networks

#### AND Gate

[AND gate](nnet-and_gate.rs) makes an AND gate out of a perceptron. The sample code generates random data to learn from. The input data resembles an electric signal between 0 and 1, with some jitter that keeps it from being exactly 0 or 1. By default, the code labels any input pair "above" (0.7, 0.7) as 1.0 (AND gate passing) and everything else as 0.0 (AND gate blocking). This means the training set is unbalanced: an AND gate passes only 25% of the time on average, yet we'd like the model to learn the passing scenario all the same. The test data uses only the 4 "perfect" inputs for a gate: (0.0, 0.0), (1.0, 0.0), etc.

The code generates 10,000 training data points by default. Give it a try, then change `SAMPLE`, the number of training data points, and `THRESHOLD`, the cutoff for deciding that a gate passes.

Sample run:

```
> cargo run --example nnet-and_gate
   Compiling rusty-machine v0.3.0 (file:///rusty-machine/rusty-machine)
     Running `target/debug/examples/nnet-and_gate`
AND gate learner sample:
Generating 10000 training data and labels...
Training...
Evaluation...
Got  Expected
0.00 0
0.00 0
0.96 1
0.01 0
Hits: 4, Misses: 0
Accuracy: 100%
```

### Naïve Bayes

#### Dog Classification

Suppose we have a population composed of red dogs and white dogs, whose friendliness, furriness, and speed can be measured. In this example we train a Naïve Bayes model to determine whether a dog is white or red. The white dogs are friendlier, furrier, and slower than the red dogs. Given the color of a dog, its friendliness, furriness, and speed are independent of each other (a requirement of the Naïve Bayes model).

In the example code we generate our own data and then train our model using it, a common technique for validating a model. We generate the data by sampling each of the dogs' features from Gaussian random variables, six in total: one per feature for each color of dog. As we are using Gaussian random variables, we use a Gaussian Naïve Bayes model. Once the data is generated, we convert it into `Matrix` structures and train the model.

Sample run:

```
$ cargo run --example naive_bayes_dogs
...
Predicted: Red; Actual: Red; Accurate? true
Predicted: Red; Actual: Red; Accurate? true
Predicted: White; Actual: Red; Accurate? false
Predicted: Red; Actual: White; Accurate? false
Predicted: Red; Actual: Red; Accurate? true
Predicted: White; Actual: White; Accurate? true
Predicted: White; Actual: White; Accurate? true
Predicted: White; Actual: White; Accurate? true
Predicted: White; Actual: White; Accurate? true
Predicted: Red; Actual: Red; Accurate? true
Accuracy: 822/1000 = 82.2%
```
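The train-then-predict flow at the heart of the example looks roughly like the sketch below, with a few hard-coded rows standing in for the randomly generated dogs, and assuming the one-hot target encoding and `Result`-returning signatures of recent `rusty-machine` releases:

```rust
use rusty_machine::learning::naive_bayes::{Gaussian, NaiveBayes};
use rusty_machine::learning::SupModel;
use rusty_machine::linalg::Matrix;

fn main() {
    // One row per dog: (friendliness, furriness, speed).
    let inputs = Matrix::new(4, 3, vec![0.9, 1.2, 0.2,  // white-ish dogs
                                        1.0, 1.1, 0.3,
                                        0.2, 0.3, 1.1,  // red-ish dogs
                                        0.3, 0.2, 0.9]);

    // One-hot targets: column 0 = White, column 1 = Red.
    let targets = Matrix::new(4, 2, vec![1.0, 0.0,
                                         1.0, 0.0,
                                         0.0, 1.0,
                                         0.0, 1.0]);

    // Gaussian Naïve Bayes: each feature is modeled per class
    // by a Gaussian distribution.
    let mut model = NaiveBayes::<Gaussian>::new();
    model.train(&inputs, &targets).unwrap();

    // Predictions come back one-hot encoded as well; this friendly,
    // furry, slow dog should land in the White column.
    let test = Matrix::new(1, 3, vec![1.0, 1.0, 0.25]);
    let outputs = model.predict(&test).unwrap();
    println!("{}", outputs);
}
```

Because predictions come back one-hot encoded, the real example compares the winning column of each prediction row against the dog's known color to produce the accuracy figure shown above.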