Network for 1D segmentation


This section assumes the reader has already read through Convolutional Neural Networks (LeNet) for convolutional networks motivation and Fully Convolutional Networks (FCN) for 2D segmentation for segmentation standard network.


The fundamental notions behind segmentation have been explained in Fully Convolutional Networks (FCN) for 2D segmentation. A particularity here is that some of these notions will be applied to 1D segmentation. However, almost every Lasagne layer used for 2D segmentation have their respective 1D layer, so the implementation would look alike if the same model was used.


The BigBrain dataset is a 3D ultra-high resolution model of the brain reconstructed from 2D sections. We are interested in the outer part of the brain, the cortex. More precisely, we are interested in segmenting the 6 different layers of the cortex in 3D. Creating an expertly labelled training dataset with each 2D section (shown in figure 1) is unfeasible. Instead of giving as input a 2D image of one section of the brain, we give as input 1D vectors with information from across the cortex, extracted from smaller portions of manually labelled cortex as shown in Figure 2. The final dataset is not available yet, a preliminary version is available here .


Figure 1 : Big Brain section


Figure 2 : Ray extraction from segmentated cortex

We will call rays the vectors of size 200 going from outside the brain and through the cortex. As the images were stained for cell bodies, the intensity of each pixel of these rays represents the cell densities and sizes contained in the cortical layer to which the pixel belongs. Since the 6 cortical layers have different properties (cell density and size), the intensity profile can be used to detect boundaries of the cortical layers.

Each ray has 2 input channels, one representing the smoothed intensity and the other, the raw version, as shown in Figure 3. The next figure, Figure 4, shows the ground truth segmentation map, where each different color represent a different label. The purple color indicate that these pixels are outside the cortex, while the 6 other colors represent the 6 cortical layers. For example, the first layer of the cortex is between pixels ~ 35-55. The cortex for this sample starts at pixel ~35 and ends at pixel ~170.


Figure 3 : Raw and smooth intensity profiles (input channels)


Figure 4 : Cortical layers labels for this ray


We first started our experiment with more complex models, but we finally found that the simpler model present here had enough capacity to learn how and where the layer boundaries are. This model (depicted in Figure 5) is composed of 8 identical blocks, followed by a last convolution and a softmax non linearity.

Each block is composed of :

  • Batch Normalization layer
  • Rectify nonlinearity layer
  • Convolution layer, with kernel size 25, with enough padding such that the convolution does not change the feature resolution, and 64 features maps

The last convolution has kernel size 1 and number of classes feature maps. The softmax is then used to detect which of these classes is more likely for each pixel. Note that any input image size could be used here, since the model is built from locally connected layers exclusively.


Figure 5 : Model

Note that we didn’t use any pooling, because it was not needed. However, if pooling layers were used, an upsampling path would have been necessary to recover full spatial size of the input ray. Also, since each pixel of the output prediction has a receptive field that includes all of the input pixel, the network is able to extract enough contextual information.


The model outputs a vector of the same size as the input (here, 200). There are 7 class labels, including the 6 cortical layers and the ‘not in the brain yet’ label. You can see in Figure 6 below the output of the model for some ray. The top of the plot represent the ground truth segmentation, while the bottoms represent the predicted segmentation. As you can see, there is only a small number of pixels not correctly segmented.


Figure 6 : Ground truth (top) vs prediction (bottom) for 1 ray

However, since the purpose was to do 3D segmentation by using 1D segmentation of the rays, we needed to put back the rays on the brain section. After interpolation between those rays and smoothing, we get the results shown in Figure 7. The colored lines are from 3D meshes based on the prediction from the model, intersected with a 2D section, and the grayscale stripes correspond to the ground truth. As you can see, it achieves really good results on the small manually labelled sample, which extend well to previously unsegmented cortex.


Figure 7 : Results put on the brain section



  • Current code works with Python 2 only.
  • If you use Theano with GPU backend (e.g. with Theano flag device=cuda), you will need at least 12GB free in your video RAM.

The FCN implementation can be found in the following file:

Change the dataset_loaders/config.ini file and add the right path for the dataset:

shared_path = /path/to/DeepLearningTutorials/data/cortical_layers/

Folder indicated at section [cortical_layers] should contain a sub-folder named 6layers_segmentation (you can obtain it by just renaming the folder extracted from TrainingData190417.tar.gz) which should itself contain files:

  • training_cls_indices.txt
  • training_cls.txt
  • training_geo.txt
  • training_raw.txt
  • training_regions.txt

First define a bn+relu+conv block that returns the name of the last layer of the block. Since the implementation uses a dictionary variable net that keeps the layer’s name as key and the actual layer object as variable, the name of the last layer is sufficient

def bn_relu_conv(net, incoming_layer, depth, num_filters, filter_size, pad = 'same'):

    net['bn'+str(depth)] = BatchNormLayer(net[incoming_layer])
    net['relu'+str(depth)] = NonlinearityLayer( net['bn'+str(depth)], nonlinearity = rectify)
    net['conv'+str(depth)] = ConvLayer(net['relu'+str(depth)],
                num_filters = num_filters, filter_size = filter_size,
                pad = pad, nonlinearity=None)
    incoming_layer = 'conv'+str(depth)

    return incoming_layer

The model is composed of 8 of these blocks, as seen below. Note that the model implementation is very tweakable, since the depth (number of blocks), the type of block, the filter size are the number of filters can all be changed by user. However, the hyperparameters used here were:

  • filter_size = 25
  • n_filters = 64
  • depth = 8
  • block = bn_relu_conv
def build_model(input_var,
	    n_classes = 6,
	    nb_in_channels = 2,
        n_filters = 64,
        depth = 8,
        last_filter_size = 1,
        block = 'bn_relu_conv',
        out_nonlin = softmax):
    input_var : theano 3Dtensor shape(n_samples, n_in_channels, ray_length)
    filter_size : odd int (to fit with same padding)
    n_filters : int, number of filters for each convLayer
    n_classes : int, number of classes to segment
    depth : int, number of stacked convolution before concatenation
    last_filter_size : int, last convolution filter size to obtain n_classes feature maps
    out_nonlin : default=softmax, non linearity function

    net = {}

    net['input'] = InputLayer((None, nb_in_channels, 200), input_var)
    incoming_layer = 'input'

    #Convolution layers
    for d in range(depth):
        if block == 'bn_relu_conv':
            incoming_layer = bn_relu_conv(net, incoming_layer, depth = d,
                            num_filters= n_filters, filter_size=filter_size)

Finally, the last convolution and softmax are achieved by :

    #Output layer
    net['final_conv'] = ConvLayer(net[incoming_layer],
                    num_filters = n_classes,
                    filter_size = last_filter_size,
    incoming_layer = 'final_conv'

    #DimshuffleLayer and ReshapeLayer to fit the softmax implementation
    #(it needs a 1D or 2D tensor, not a 3D tensor)
    net['final_dimshuffle'] = DimshuffleLayer(net[incoming_layer], (0,2,1))
    incoming_layer = 'final_dimshuffle'

    layerSize = lasagne.layers.get_output(net[incoming_layer]).shape
    net['final_reshape'] = ReshapeLayer(net[incoming_layer],
                                # (200*batch_size,n_classes))
    incoming_layer = 'final_reshape'

    #This is the layer that computes the prediction
    net['last_layer'] = NonlinearityLayer(net[incoming_layer],
                    nonlinearity = out_nonlin)
    incoming_layer = 'last_layer'

    #Layers needed to visualize the prediction of the network
    net['probs_reshape'] = ReshapeLayer(net[incoming_layer],
                    (layerSize[0], layerSize[1], n_classes))
    incoming_layer = 'probs_reshape'

    net['probs_dimshuffle'] = DimshuffleLayer(net[incoming_layer], (0,2,1))

    return [net[l] for l in ['last_layer']], net

Running on a Titan X lasted for around 4 hours, ending with the following:

THEANO_FLAGS=device=cuda0,floatX=float32,dnn.conv.algo_fwd=time_once,dnn.conv.algo_bwd_data=time_once,dnn.conv.algo_bwd_filter=time_once,gpuarray.preallocate=1 python
EPOCH 412: Avg cost train 0.065615, acc train 0.993349, cost val 0.041758, acc val 0.984398, jacc val per class ['0: 0.981183', '1: 0.953546', '2: 0.945765', '3: 0.980471', '4: 0.914617', '5: 0.968710', '6: 0.971049'], jacc val 0.959335 took 31.422823 s
saving last model


If you use this tutorial, please cite the following papers:

  • References for BigBrain:
    • [pdf] Lewis, L.B. et al.: BigBrain: Initial Tissue Classification and Surface Extraction, HBM 2014.
    • [website] Amunts, K. et al.: “BigBrain: An Ultrahigh-Resolution 3D Human Brain Model”, Science (2013) 340 no. 6139 1472-1475, June 2013.
    • [pdf] Bludau, S. et al.: Two new Cytoarchitectonic Areas of the Human Frontal Pole, OHBM 2012.
    • [pdf] Lepage, C. et al.: Automatic Repair of Acquisition Defects in Reconstruction of Histology Sections of a Human Brain, HBM 2010.
  • [GitHub Repo] Francesco Visin, Adriana Romero - Dataset loaders: a python library to load and preprocess datasets. 2017

Papers related to Theano/Lasagne:

  • [pdf] Theano Development Team. Theano: A Python framework for fast computation of mathematical expresssions. May 2016.
  • [website] Sander Dieleman, Jan Schluter, Colin Raffel, Eben Olson, Søren Kaae Sønderby, Daniel Nouri, Daniel Maturana, Martin Thoma, Eric Battenberg, Jack Kelly, Jeffrey De Fauw, Michael Heilman, diogo149, Brian McFee, Hendrik Weideman, takacsg84, peterderivaz, Jon, instagibbs, Dr. Kashif Rasul, CongLiu, Britefury, and Jonas Degrave, “Lasagne: First release.” (2015).


This work was done in collaboration with Konrad Wagstyl, PhD student, University of Cambridge. We would like to thank Professor Alan Evans’ [MCIN lab] and Professor Katrin Amunts’ [INM-1 lab].

Thank you!