Optimize Layers

In the previous chapter, 3. Solver, we learned what a solver is and what it does. In this chapter, we look at how to optimize a network via a Solver.

After its initialization, a Solver has two Layers: one for the network that will be optimized and one for the objective. The objective uses the output of the network layer to compute the loss. The Solver then uses the loss to optimize the network.

The Solver has a very simple API: .train_minibatch and .network. The optimization of the network is kicked off by the .train_minibatch method, which takes two input parameters: the data that is fed to the network and the expected target value for the network.

An SGD (Stochastic Gradient Descent) Solver now computes the output of the network from the input data, passes that output together with the expected target value to the objective layer, and uses the result, together with the gradient of the network, to optimize the weights of the network.

/// Train the network with one minibatch
pub fn train_minibatch(&mut self, mb_data: ArcLock<SharedTensor<f32>>, mb_target: ArcLock<SharedTensor<f32>>) -> ArcLock<SharedTensor<f32>> {
    // forward through network and classifier
    let network_out = self.net.forward(&[mb_data])[0].clone();
    let _ = self.objective.forward(&[network_out.clone(), mb_target]);

    // backward through classifier and network
    let classifier_gradient = self.objective.backward(&[]);
    self.net.backward(&classifier_gradient[0 .. 1]);

    self.worker.compute_update(&self.config, &mut self.net, self.iter);
    self.net.update_weights(self.worker.backend());
    self.iter += 1;

    network_out
}
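The four steps in train_minibatch (forward, objective, backward, weight update) can be sketched without any Juice types. The following is a toy illustration only: ToyNet, SquaredError and ToySolver are hypothetical names invented for this sketch, the "network" is a single weight, and the update rule is plain SGD with a fixed learning rate rather than whatever the configured worker computes in Juice.

```rust
/// Toy stand-in for a network: one linear unit, out = weight * x.
/// (Hypothetical type for illustration; not part of Juice.)
struct ToyNet {
    weight: f32,
    weight_gradient: f32,
    last_input: f32,
}

impl ToyNet {
    /// Forward pass: remember the input and return the prediction.
    fn forward(&mut self, x: f32) -> f32 {
        self.last_input = x;
        self.weight * x
    }

    /// Backward pass (chain rule): d(loss)/d(weight) = d(loss)/d(out) * x.
    fn backward(&mut self, output_gradient: f32) {
        self.weight_gradient = output_gradient * self.last_input;
    }

    /// Plain SGD update: step against the gradient.
    fn update_weights(&mut self, learning_rate: f32) {
        self.weight -= learning_rate * self.weight_gradient;
    }
}

/// Toy objective: squared error, loss = 0.5 * (out - target)^2.
struct SquaredError {
    last_diff: f32,
}

impl SquaredError {
    fn forward(&mut self, out: f32, target: f32) -> f32 {
        self.last_diff = out - target;
        0.5 * self.last_diff * self.last_diff
    }

    /// Gradient of the loss with respect to the network output.
    fn backward(&self) -> f32 {
        self.last_diff
    }
}

/// Toy solver holding a network, an objective and an iteration counter.
struct ToySolver {
    net: ToyNet,
    objective: SquaredError,
    learning_rate: f32,
    iter: usize,
}

impl ToySolver {
    fn new(initial_weight: f32, learning_rate: f32) -> Self {
        ToySolver {
            net: ToyNet { weight: initial_weight, weight_gradient: 0.0, last_input: 0.0 },
            objective: SquaredError { last_diff: 0.0 },
            learning_rate,
            iter: 0,
        }
    }

    /// Same order of operations as the method above, on scalars.
    fn train_minibatch(&mut self, x: f32, target: f32) -> f32 {
        // forward through network and objective
        let network_out = self.net.forward(x);
        let _loss = self.objective.forward(network_out, target);

        // backward through objective and network
        let objective_gradient = self.objective.backward();
        self.net.backward(objective_gradient);

        // apply the update and count the iteration
        self.net.update_weights(self.learning_rate);
        self.iter += 1;

        network_out
    }
}
```

Calling train_minibatch(1.0, 2.0) repeatedly on ToySolver::new(0.0, 0.1) moves the weight towards 2.0. As in the real method, the value returned is the network output computed before the weights were updated.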

Using .train_minibatch is straightforward. We pass the data as well as the expected result of the network to the .train_minibatch method of the initialized Solver struct. A more detailed example can be found at the spearow/juice-examples repository.

let inp_lock = Arc::new(RwLock::new(inp));
let label_lock = Arc::new(RwLock::new(label));

// train the network!
let inferred_out = solver.train_minibatch(inp_lock.clone(), label_lock.clone());
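The inputs are wrapped in Arc<RwLock<...>> and cloned before being handed over. A small sketch of what that clone does may help: it copies the handle, not the tensor data. The sketch below uses a Vec<f32> as a stand-in for SharedTensor<f32>, declares its own ArcLock alias (inferred from the signature and the calls shown above), and the helper functions sum_shared and scale_in_place are hypothetical.

```rust
use std::sync::{Arc, RwLock};

/// Local alias with the shape implied by the snippets above.
type ArcLock<T> = Arc<RwLock<T>>;

/// Stand-in for a consumer such as a layer: takes ownership of one handle,
/// acquires a read lock and sums the shared buffer.
fn sum_shared(data: ArcLock<Vec<f32>>) -> f32 {
    let guard = data.read().unwrap();
    let total: f32 = guard.iter().sum();
    total
}

/// Stand-in for a producer: acquires a write lock and modifies the buffer in place.
fn scale_in_place(data: &ArcLock<Vec<f32>>, factor: f32) {
    for value in data.write().unwrap().iter_mut() {
        *value *= factor;
    }
}
```

If a handle is cloned first and the original is then modified through scale_in_place, sum_shared on the clone sees the modified values, because both handles point at the same buffer. Passing inp_lock.clone() therefore keeps inp_lock usable by the caller without duplicating the underlying data.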

If we don't want the network to be trained, we can use the .network method of the Solver to get access to the network. The Solver actually has two network methods: .network and .mut_network.

To run just the forward pass of the network without any optimization, we can run

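// forward takes a slice of inputs and returns one output per output blob, as in train_minibatch;
// it needs mutable access to the layer, hence mut_network
let inferred_out = solver.mut_network().forward(&[inp_lock.clone()])[0].clone();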

Juice ships with a confusion matrix, which is a convenient way to visualize the performance of the optimized network.

let inferred_out = solver.train_minibatch(inp_lock.clone(), label_lock.clone());

let mut inferred = inferred_out.write().unwrap();
let predictions = confusion.get_predictions(&mut inferred);

confusion.add_samples(&predictions, &targets);
println!("Last sample: {} | Accuracy {}", confusion.samples().iter().last().unwrap(), confusion.accuracy());
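To show what such a matrix keeps track of, here is a minimal standalone sketch. It is not Juice's implementation: predictions and ToyConfusion are hypothetical names, the network output is assumed to be a flat buffer of per-class scores with one row per sample, and n_classes is assumed to be greater than zero.

```rust
/// Picks, for every sample, the class with the highest score.
/// `scores` is a flat buffer of `n_samples * n_classes` values, one row per sample.
fn predictions(scores: &[f32], n_classes: usize) -> Vec<usize> {
    scores
        .chunks(n_classes)
        .map(|row| {
            let mut best = 0;
            for (class, &score) in row.iter().enumerate() {
                if score > row[best] {
                    best = class;
                }
            }
            best
        })
        .collect()
}

/// Toy confusion matrix: counts[target * n_classes + predicted].
struct ToyConfusion {
    n_classes: usize,
    counts: Vec<usize>,
}

impl ToyConfusion {
    fn new(n_classes: usize) -> Self {
        ToyConfusion { n_classes, counts: vec![0; n_classes * n_classes] }
    }

    /// Records one (prediction, target) pair per sample.
    fn add_samples(&mut self, predicted: &[usize], targets: &[usize]) {
        for (&p, &t) in predicted.iter().zip(targets) {
            self.counts[t * self.n_classes + p] += 1;
        }
    }

    /// Fraction of samples on the diagonal, i.e. predicted == target.
    fn accuracy(&self) -> f32 {
        let total: usize = self.counts.iter().sum();
        if total == 0 {
            return 0.0;
        }
        let correct: usize = (0..self.n_classes)
            .map(|class| self.counts[class * self.n_classes + class])
            .sum();
        correct as f32 / total as f32
    }
}
```

The off-diagonal cells are what make the matrix more informative than a single accuracy number: each one counts how often a given target class was mistaken for a given other class.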

A more detailed example can be found at the spearow/juice-examples repository.