Extending Theano

This tutorial covers how to extend Theano with novel ops. It mainly focuses on ops that offer a Python implementation, refers to Extending Theano with a C Op for C-based op. The first section of this tutorial introduces the Theano Graphs, as providing a novel Theano op requires a basic understanting of the Theano Graphs. It then proposes an overview of the most important methods that define an op.

As an illustration, this tutorial shows how to write a simple Python-based op which performs operations on Double. It also shows how to implement tests that ensure the proper working of an op.


This tutorial does not cover how to make an op that returns a view or modifies the values in its inputs. Thus, all ops created with the instructions described here MUST return newly allocated memory or reuse the memory provided in the parameter output_storage of the perform() function. See Views and inplace operations for an explanation on how to do this.

If your op returns a view or changes the value of its inputs without doing as prescribed in that page, Theano will run, but will return correct results for some graphs and wrong results for others.

It is recommended that you run your tests in DebugMode (Theano flag mode=DebugMode) since it verifies if your op behaves correctly in this regard.


See the Developer Start Guide for information regarding the versioning framework, namely about git and GitHub, regarding the development workflow and how to make a quality contribution.

Theano Graphs


Theano represents symbolic mathematical computations as graphs. Those graphs are bi-partite graphs (graphs with 2 types of nodes), they are composed of interconnected Apply and Variable nodes. Variable nodes represent data in the graph, either inputs, outputs or intermediary values. As such, Inputs and Outputs of a graph are lists of Theano Variable nodes. Apply nodes perform computation on these variables to produce new variables. Each Apply node has a link to an instance of Op which describes the computation to perform. This tutorial details how to write such an Op instance. Please refers to Graph Structures for a more detailed explanation about the graph structure.

Op Structure

An op is any Python object which inherits from gof.Op. This section provides an overview of the methods you typically have to implement to make a new op. It does not provide extensive coverage of all the possibilities you may encounter or need. For that refer to Op’s contract.

import theano

class MyOp(theano.Op):
    # Properties attribute
    __props__ = ()

    def make_node(self, *inputs):

    # Python implementation:
    def perform(self, node, inputs_storage, output_storage):

    # Other type of implementation
    # C implementation: [see theano web site for other functions]
    def c_code(self, node, inputs, outputs, sub):

    # Other implementations (pycuda, ...):
    def make_thunk(self, node, storage_map, _, _2):

    # optional:
    check_input = True

    def __init__(self, *args):

    def grad(self, inputs, g):

    def R_op(self, inputs, eval_points):

    def infer_shape(node, input_shapes):

An op has to implement some methods defined in the the interface of gof.Op. More specifically, it is mandatory for an op to define the method make_node() and one of the implementation methods, either perform(), Op.c_code() or make_thunk().

make_node() method creates an Apply node representing the application of the op on the inputs provided. This method is reponsible for three things:

  • it first checks that the input Variables types are compatible with the current op. If the op cannot be applied on the provided input types, it must raises an exception (such as TypeError).
  • it operates on the Variables found in *inputs in Theano’s symbolic language to infer the type of the symbolic output Variables. It creates output Variables of a suitable symbolic Type to serve as the outputs of this op’s application.
  • it creates an Apply instance with the input and output Variable, and return the Apply instance.

perform() method defines the Python implementation of an op. It takes several arguments:

  • node is a reference to an Apply node which was previously obtained via the Op‘s make_node() method. It is typically not used in simple ops, but it contains symbolic information that could be required for complex ops.
  • inputs is a list of references to data which can be operated on using non-symbolic statements, (i.e., statements in Python, Numpy).
  • output_storage is a list of storage cells where the output is to be stored. There is one storage cell for each output of the op. The data put in output_storage must match the type of the symbolic output. It is forbidden to change the length of the list(s) contained in output_storage. A function Mode may allow output_storage elements to persist between evaluations, or it may reset output_storage cells to hold a value of None. It can also pre-allocate some memory for the op to use. This feature can allow perform to reuse memory between calls, for example. If there is something preallocated in the output_storage, it will be of the good dtype, but can have the wrong shape and have any stride pattern.

perform() method must be determined by the inputs. That is to say, when applied to identical inputs the method must return the same outputs.

gof.Op allows some other way to define the op implentation. For instance, it is possible to define Op.c_code() to provide a C-implementation to the op. Please refers to tutorial Extending Theano with a C Op for a description of Op.c_code() and other related c_methods. Note that an op can provide both Python and C implementation.

make_thunk() method is another alternative to perform(). It returns a thunk. A thunk is defined as a zero-arguments function which encapsulates the computation to be performed by an op on the arguments of its corresponding node. It takes several parameters:

  • node is the Apply instance for which a thunk is requested,
  • storage_map is a dict of lists which maps variables to a one-element lists holding the variable’s current value. The one-element list acts as pointer to the value and allows sharing that “pointer” with other nodes and instances.
  • compute_map is also a dict of lists. It maps variables to one-element lists holding booleans. If the value is 0 then the variable has not been computed and the value should not be considered valid. If the value is 1 the variable has been computed and the value is valid. If the value is 2 the variable has been garbage-collected and is no longer valid, but shouldn’t be required anymore for this call. The returned function must ensure that it sets the computed variables as computed in the compute_map.

make_thunk() is useful if you want to generate code and compile it yourself. For example, this allows you to use PyCUDA to compile GPU code.

If make_thunk() is defined by an op, it will be used by Theano to obtain the op’s implementation. perform() and Op.c_code() will be ignored.

Other methods can be optionally defined by the op.

The __str__() method provides a meaningful string representation of your op.

__eq__() and __hash__() define respectivelly equality between two ops and the hash of an op instance. They will be used by the optimization phase to merge nodes that are doing equivalent computations (same inputs, same operation). Two ops that are equal according __eq__() should return the same output when they are applied on the same inputs.

The __props__ lists the properties that influence how the computation is performed (Ususally these are those that you set in __init__()). It must be a tuple. If you don’t have any properties, then you should set this attribute to the emtpy tuple ().

__props__ enables the automatic generation of appropriate __eq__() and __hash__(). Given the method __eq__(), automatically generated from __props__, two ops will be equal if they have the same values for all the properties listed in __props__. Given to the method __hash__() automatically generated from __props__, two ops will be have the same hash if they have the same values for all the properties listed in __props__. __props__ will also generate a suitable __str__() for your op. This requires development version after September 1st, 2014 or version 0.7.

The infer_shape() method allows to infer the shape of the op output variables, without actually computing the outputs. It takes as input node, a reference to the op Apply node, and a list of Theano symbolic Varables (i0_shape, i1_shape, ...) which are the shape of the op input Variables. infer_shape() returns a list where each element is a tuple representing the shape of one output. This could be helpful if one only needs the shape of the output instead of the actual outputs, which can be useful, for instance, for optimization procedures.

The grad() method is required if you want to differentiate some cost whose expression includes your op. The gradient may be specified symbolically in this method. It takes two arguments inputs and output_gradients which are both lists of symbolic Theano Variables and those must be operated on using Theano’s symbolic language. The grad method must return a list containing one Variable for each input. Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input then this method should be defined to return a variable of type NullType for that input. Likewise, if you have not implemented the grad computation for some input, you may return a variable of type NullType for that input. Please refer to grad() for a more detailed view.

The R_op() method is needed if you want theano.tensor.Rop to work with your op. This function implements the application of the R-operator on the function represented by your op. Let assume that function is f, with input x, applying the R-operator means computing the Jacobian of f and right-multiplying it by v, the evaluation point, namely: \frac{\partial f}{\partial x} v.

The optional boolean check_input attribute is used to specify if you want the types used in your op to check their inputs in their c_code. It can be used to speed up compilation, reduce overhead (particularly for scalars) and reduce the number of generated C files.

Op Example

import theano

class DoubleOp(theano.Op):
    __props__ = ()

    def make_node(self, x):
        # check that the theano version has support for __props__.
        # This next line looks like it has a typo,
        # but it's actually a way to detect the theano version
        # is sufficiently recent to support the use of __props__.
        assert hasattr(self, '_props'), "Your version of theano is too old to support __props__."
        x = theano.tensor.as_tensor_variable(x)
        return theano.Apply(self, [x], [x.type()])

    def perform(self, node, inputs, output_storage):
        x = inputs[0]
        z = output_storage[0]
        z[0] = x * 2

    def infer_shape(self, node, i0_shapes):
        return i0_shapes

    def grad(self, inputs, output_grads):
        return [output_grads[0] * 2]

    def R_op(self, inputs, eval_points):
        # R_op can receive None as eval_points.
        # That mean there is no diferientiable path through that input
        # If this imply that you cannot compute some outputs,
        # return None for those.
        if eval_points[0] is None:
            return eval_points
        return self.grad(inputs, eval_points)

You can try it as follows:

x = theano.tensor.matrix()
f = theano.function([x], DoubleOp()(x))
import numpy
inp = numpy.random.rand(5, 4)
out = f(inp)
assert numpy.allclose(inp * 2, out)
[[ 0.02443785  0.67833979  0.91954769  0.95444365]
 [ 0.60853382  0.7770539   0.78163219  0.92838837]
 [ 0.04427765  0.37895602  0.23155797  0.4934699 ]
 [ 0.20551517  0.7419955   0.34500905  0.49347629]
 [ 0.24082769  0.49321452  0.24566545  0.15351132]]
[[ 0.04887571  1.35667957  1.83909538  1.90888731]
 [ 1.21706764  1.55410779  1.56326439  1.85677674]
 [ 0.08855531  0.75791203  0.46311594  0.9869398 ]
 [ 0.41103034  1.48399101  0.69001811  0.98695258]
 [ 0.48165539  0.98642904  0.4913309   0.30702264]]

Example for properties of a Op

We can modify the previous piece of code in order to demonstrate the usage of the __props__ attribute.

We create an Op that takes a variable x and returns a*x+b. We want to say that two such ops are equal when their values of a and b are equal.

import theano

class AXPBOp(theano.Op):
    This creates an Op that takes x to a*x+b.
    __props__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b
        super(AXPBOp, self).__init__()

    def make_node(self, x):
        # check that the theano version has support for __props__.
        assert hasattr(self, '_props'), "Your version of theano is too old to support __props__."
        x = theano.tensor.as_tensor_variable(x)
        return theano.Apply(self, [x], [x.type()])

    def perform(self, node, inputs, output_storage):
        x = inputs[0]
        z = output_storage[0]
        z[0] = self.a * x + self.b

    def infer_shape(self, node, i0_shapes):
        return i0_shapes

    def grad(self, inputs, output_grads):
        return [a * output_grads[0] + b]

The use of __props__ saves the user the trouble of implementing __eq__() and __hash__() manually. It also generates a default __str__() method that prints the attribute names and their values.

We can test this by running the following segment:

mult4plus5op = AXPBOp(4, 5)
another_mult4plus5op = AXPBOp(4, 5)
mult2plus3op = AXPBOp(2, 3)

assert mult4plus5op == another_mult4plus5op
assert mult4plus5op != mult2plus3op

x = theano.tensor.matrix()
f = theano.function([x], mult4plus5op(x))
g = theano.function([x], mult2plus3op(x))

import numpy
inp = numpy.random.rand(5, 4).astype(numpy.float32)
assert numpy.allclose(4 * inp + 5, f(inp))
assert numpy.allclose(2 * inp + 3, g(inp))

How To Test it

Theano has some functionalities to simplify testing. These help test the infer_shape, grad and R_op methods. Put the following code in a file and execute it with the theano-nose program.

Basic Tests

Basic tests are done by you just by using the op and checking that it returns the right answer. If you detect an error, you must raise an exception. You can use the assert keyword to automatically raise an AssertionError.

import numpy
import theano

from theano.tests import unittest_tools as utt
from theano import config
class test_Double(utt.InferShapeTester):
    def setUp(self):
        super(test_Double, self).setUp()
        self.op_class = DoubleOp
        self.op = DoubleOp()

    def test_basic(self):
        x = theano.tensor.matrix()
        f = theano.function([x], self.op(x))
        inp = numpy.asarray(numpy.random.rand(5, 4), dtype=config.floatX)
        out = f(inp)
        # Compare the result computed to the expected value.
        utt.assert_allclose(inp * 2, out)

We call utt.assert_allclose(expected_value, value) to compare NumPy ndarray.This raise an error message with more information. Also, the default tolerance can be changed with the Theano flags config.tensor.cmp_sloppy that take values in 0, 1 and 2. The defaul value do the most strict comparison, 1 and 2 make less strict comparison.

Testing the infer_shape

When a class inherits from the InferShapeTester class, it gets the self._compile_and_check method that tests the op’s infer_shape method. It tests that the op gets optimized out of the graph if only the shape of the output is needed and not the output itself. Additionally, it checks that the optimized graph computes the correct shape, by comparing it to the actual shape of the computed output.

self._compile_and_check compiles a Theano function. It takes as parameters the lists of input and output Theano variables, as would be provided to theano.function, and a list of real values to pass to the compiled function. It also takes the op class as a parameter in order to verify that no instance of it appears in the shape-optimized graph.

If there is an error, the function raises an exception. If you want to see it fail, you can implement an incorrect infer_shape.

When testing with input values with shapes that take the same value over different dimensions (for instance, a square matrix, or a tensor3 with shape (n, n, n), or (m, n, m)), it is not possible to detect if the output shape was computed correctly, or if some shapes with the same value have been mixed up. For instance, if the infer_shape uses the width of a matrix instead of its height, then testing with only square matrices will not detect the problem. This is why the self._compile_and_check method prints a warning in such a case. If your op works only with such matrices, you can disable the warning with the warn=False parameter.

from theano.tests import unittest_tools as utt
from theano import config
class test_Double(utt.InferShapeTester):
    # [...] as previous tests.
    def test_infer_shape(self):
        x = theano.tensor.matrix()
        self._compile_and_check([x],  # theano.function inputs
                                [self.op(x)],  # theano.function outputs
                                # Always use not square matrix!
                                # inputs data
                                [numpy.asarray(numpy.random.rand(5, 4),
                                # Op that should be removed from the graph.

Testing the gradient

The function verify_grad verifies the gradient of an op or Theano graph. It compares the analytic (symbolically computed) gradient and the numeric gradient (computed through the Finite Difference Method).

If there is an error, the function raises an exception. If you want to see it fail, you can implement an incorrect gradient (for instance, by removing the multiplication by 2).

def test_grad(self):
                                            [numpy.random.rand(5, 7, 2)])

Testing the Rop

The class RopLop_checker defines the functions RopLop_checker.check_mat_rop_lop(), RopLop_checker.check_rop_lop() and RopLop_checker.check_nondiff_rop(). These allow to test the implementation of the Rop method of a particular op.

For instance, to verify the Rop method of the DoubleOp, you can use this:

import numpy
import theano.tests
from theano.tests.test_rop import RopLop_checker
class test_DoubleRop(RopLop_checker):
    def setUp(self):
        super(test_DoubleRop, self).setUp()
    def test_double_rop(self):
        self.check_rop_lop(DoubleRop()(self.x), self.in_shape)

Testing GPU Ops

Ops to be executed on the GPU should inherit from the theano.sandbox.cuda.GpuOp and not theano.Op. This allows Theano to distinguish them. Currently, we use this to test if the NVIDIA driver works correctly with our sum reduction code on the GPU.

Running Your Tests

To perform your tests, you may select either one of the three following methods:


The method of choice to conduct tests is to run the file theano-nose. In a regular Theano installation, the latter will be on the operating system’s path and directly accessible from any folder. Otherwise, it can be accessed in the Theano/bin folder. The following command lines may be used for the corresponding purposes:

  • theano-nose --theano: Run every test found in Theano’s path.
  • theano-nose folder_name: Run every test found in the folder folder_name.
  • theano-nose test_file.py: Run every test found in the file test_file.py.

The following are particularly useful for development purposes since they call for particular classes or even for particular tests:

  • theano-nose test_file.py:test_DoubleRop: Run every test found inside the class test_DoubleRop.
  • theano-nose test_file.py:test_DoubleRop.test_double_op: Run only the test test_double_op in the class test_DoubleRop.

Help with the use and functionalities of theano-nose may be obtained by running it with the command line parameter --help (-h).


The command nosetests can also be used. Although it lacks the useful functionalities that theano-nose provides, nosetests can be called similarly to theano-nose from any folder in Python’s path like so:

nosetests [suffix similar to the above].

More documentation on nosetests is available here: nosetests.


One may also add a block of code similar to the following at the end of the file containing a specific test of interest and run the file. In this example, the test test_DoubleRop in the class test_double_op would be performed.

if __name__ == '__main__':
   t = test_DoubleRop("test_double_rop")

We recommend that when we execute a file, we run all tests in that file. This can be done by adding this at the end of your test files:

if __name__ == '__main__':


Run the code of the DoubleOp example above.

Modify and execute to compute: x * y.

Modify and execute the example to return two outputs: x + y and x - y.

You can omit the Rop functions. Try to implement the testing apparatus described above.

(Notice that Theano’s current elemwise fusion optimization is only applicable to computations involving a single output. Hence, to gain efficiency over the basic solution that is asked here, the two operations would have to be jointly optimized explicitly in the code.)


as_op is a python decorator that converts a python function into a basic Theano op that will call the supplied function during execution.

This isn’t the recommended way to build an op, but allows for a quick implementation.

It takes an optional infer_shape() parameter that must have this signature:

def infer_shape(node, input_shapes):
     # ...
     return output_shapes
  • input_shapes and output_shapes are lists of tuples that represent the shape of the corresponding inputs/outputs.


Not providing the infer_shape method prevents shape-related optimizations from working with this op. For example your_op(inputs, ...).shape will need the op to be executed just to get the shape.


As no grad is defined, this means you won’t be able to differentiate paths that include this op.


It converts the Python function to a callable object that takes as inputs Theano variables that were declared.

as_op Example

import theano
import numpy
from theano import function
from theano.compile.ops import as_op

def infer_shape_numpy_dot(node, input_shapes):
    ashp, bshp = input_shapes
    return [ashp[:-1] + bshp[-1:]]

@as_op(itypes=[theano.tensor.fmatrix, theano.tensor.fmatrix],
       otypes=[theano.tensor.fmatrix], infer_shape=infer_shape_numpy_dot)
def numpy_dot(a, b):
   return numpy.dot(a, b)

You can try it as follows:

x = theano.tensor.fmatrix()
y = theano.tensor.fmatrix()
f = function([x, y], numpy_dot(x, y))
inp1 = numpy.random.rand(5, 4).astype('float32')
inp2 = numpy.random.rand(4, 7).astype('float32')
out = f(inp1, inp2)


Run the code of the numpy_dot example above.

Modify and execute to compute: numpy.add and numpy.subtract.

Modify and execute the example to return two outputs: x + y
and x - y.

Random numbers in tests

Making tests errors more reproducible is a good practice. To make your tests more reproducible, you need a way to get the same random numbers. You can do this by seeding NumPy’s random number generator.

For convenience, the classes InferShapeTester and RopLop_checker already do this for you. If you implement your own setUp function, don’t forget to call the parent setUp function.

For more details see Using Random Values in Test Cases.



See Documentation Documentation AKA Meta-Documentation, for some information on how to generate the documentation.

Here is an example how to add docstring to a class.

import theano

class DoubleOp(theano.Op):
    """ Double each element of a tensor.

    :param x: input tensor.

    :return: a tensor of the same shape and dtype as the input with all
    values doubled.

        this is a test note

        You can use the elemwise op to replace this example.
        Just execute `x * 2` with x being a Theano variable.

    .. versionadded:: 0.6

This is how it will show up for files that we auto-list in the library documentation:

class theano.misc.doubleop.DoubleOp(use_c_code='/usr/bin/g++')

Double each element of a tensor.

Parameters:x – input tensor.
Returns:a tensor of the same shape and dtype as the input with all values doubled.
Note:this is a test note
Seealso:You can use the elemwise op to replace this example. Just execute x * 2 with x being a Theano variable.

New in version 0.6.

Final Note

A more extensive discussion of this section’s content may be found in the advanced tutorial Extending Theano.

The section Other ops includes more instructions for the following specific cases: