Create a Network
In the previous chapters, we learned that in Juice everything is built from
layers and that the result of that construction is itself a layer, which means it can
serve as a building block for something bigger. This is possible because a Layer
can implement any behavior as long as it takes an input and produces
an output.
In 2.1 Layer Lifecycle
we saw that a single LayerConfig
can be turned via
Layer::from_config
into an actual Layer
. But as Deep Learning relies on
chaining multiple layers together, we need a Layer
that implements this chaining
behavior for us.
Enter the container layers.
Networks via the Sequential layer
A Sequential
Layer is a container layer. The config of a
container layer has a special method called
.add_layer
, which takes one LayerConfig
and appends it to an ordered list in the
SequentialConfig
.
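A minimal sketch of that method, using the same layer configs that appear in the full network below; each call appends one LayerConfig to the ordered list, so the layers are kept in the order they were added:
let mut cfg = SequentialConfig::default();
// first entry in the ordered list
cfg.add_layer(LayerConfig::new("linear1", LinearConfig { output_size: 500 }));
// second entry, placed directly after "linear1"
cfg.add_layer(LayerConfig::new("sigmoid", LayerType::Sigmoid));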
When a SequentialConfig
is turned into a Layer
by passing the config to
Layer::from_config
, the Sequential
layer initializes all the
layers that were added via .add_layer
and connects them with each other.
This means that the output of one layer becomes the input of the next layer in the
list.
The input of a sequential Layer
becomes the input of its
first layer; the sequential worker then takes care
of passing that input through all the layers, and the output of the last layer
becomes the output of the Layer
backed by the sequential worker. A sequential Layer
therefore fulfills the requirements of a Layer
: take an input,
return an output.
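To make the data flow concrete, here is a minimal sketch of that chaining behavior. The Chainable trait and forward_sequential function are illustrative stand-ins, not part of Juice's internals:
// illustrative stand-in for a layer that takes an input and returns an output
trait Chainable {
    fn forward(&self, input: &[f32]) -> Vec<f32>;
}

// what a sequential container does conceptually during the forward pass
fn forward_sequential(layers: &[Box<dyn Chainable>], input: &[f32]) -> Vec<f32> {
    let mut blob = input.to_vec();
    for layer in layers {
        // the previous layer's output becomes the next layer's input
        blob = layer.forward(&blob);
    }
    // the last layer's output is the output of the whole container
    blob
}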
// short form for: &LayerConfig::new("net", LayerType::Sequential(cfg))
let mut net_cfg = SequentialConfig::default();
net_cfg.add_input("data", &vec![batch_size, 28, 28]);
net_cfg.add_layer(LayerConfig::new("reshape", ReshapeConfig::of_shape(&vec![batch_size, 1, 28, 28])));
net_cfg.add_layer(LayerConfig::new("conv", ConvolutionConfig { num_output: 20, filter_shape: vec![5], stride: vec![1], padding: vec![0] }));
net_cfg.add_layer(LayerConfig::new("pooling", PoolingConfig { mode: PoolingMode::Max, filter_shape: vec![2], stride: vec![2], padding: vec![0] }));
net_cfg.add_layer(LayerConfig::new("linear1", LinearConfig { output_size: 500 }));
net_cfg.add_layer(LayerConfig::new("sigmoid", LayerType::Sigmoid));
net_cfg.add_layer(LayerConfig::new("linear2", LinearConfig { output_size: 10 }));
net_cfg.add_layer(LayerConfig::new("log_softmax", LayerType::LogSoftmax));
// set up the sequential layer aka. a deep, convolutional network
let mut net = Layer::from_config(backend.clone(), &net_cfg);
As a sequential layer is like any other layer, we can use sequential layers as building blocks for larger networks. Important building blocks of a network can be grouped into a sequential layer and published as a crate for others to use.
// short form for: &LayerConfig::new("net", LayerType::Sequential(cfg))
let mut conv_net = SequentialConfig::default();
conv_net.add_input("data", &vec![batch_size, 28, 28]);
conv_net.add_layer(LayerConfig::new("reshape", ReshapeConfig::of_shape(&vec![batch_size, 1, 28, 28])));
conv_net.add_layer(LayerConfig::new("conv", ConvolutionConfig { num_output: 20, filter_shape: vec![5], stride: vec![1], padding: vec![0] }));
conv_net.add_layer(LayerConfig::new("pooling", PoolingConfig { mode: PoolingMode::Max, filter_shape: vec![2], stride: vec![2], padding: vec![0] }));
conv_net.add_layer(LayerConfig::new("linear1", LinearConfig { output_size: 500 }));
conv_net.add_layer(LayerConfig::new("sigmoid", LayerType::Sigmoid));
conv_net.add_layer(LayerConfig::new("linear2", LinearConfig { output_size: 10 }));
let mut net_cfg = SequentialConfig::default();
net_cfg.add_layer(LayerConfig::new("conv_net", conv_net));
net_cfg.add_layer(LayerConfig::new("linear", LinearConfig { output_size: 500 }));
net_cfg.add_layer(LayerConfig::new("log_softmax", LayerType::LogSoftmax));
// set up the 'big' network
let mut net = Layer::from_config(backend.clone(), &net_cfg);
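As a sketch of the crate idea mentioned above, a reusable building block could be exposed as a function that returns a ready-made SequentialConfig for callers to nest into their own networks. The name mnist_conv_net is hypothetical and not part of Juice:
// hypothetical helper a crate could publish: a pre-configured SequentialConfig
pub fn mnist_conv_net(batch_size: usize) -> SequentialConfig {
    let mut cfg = SequentialConfig::default();
    cfg.add_input("data", &vec![batch_size, 28, 28]);
    cfg.add_layer(LayerConfig::new("reshape", ReshapeConfig::of_shape(&vec![batch_size, 1, 28, 28])));
    cfg.add_layer(LayerConfig::new("conv", ConvolutionConfig { num_output: 20, filter_shape: vec![5], stride: vec![1], padding: vec![0] }));
    cfg.add_layer(LayerConfig::new("pooling", PoolingConfig { mode: PoolingMode::Max, filter_shape: vec![2], stride: vec![2], padding: vec![0] }));
    cfg
}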
Networks via other container layers
So far there is only the sequential layer, but other container layers with slightly different behaviors are conceivable, for example a parallel or a concat layer in addition to the sequential layer.
How to train or optimize the constructed network is the topic of chapter 3. Solvers.