gradient – Symbolic Differentiation

Platforms: Unix, Windows

A symbolic gradient is usually computed with tensor.grad(), which offers a more convenient syntax for the common case of wanting the gradient of a scalar cost with respect to some expressions. The grad_sources_inputs() function does the underlying work and is more flexible, but it is more awkward to use when tensor.grad() can do the job.

gradient.grad_sources_inputs(sources, graph_inputs, warn_type=True)

A gradient source is a pair (r, g_r), in which r is a Variable, and g_r is a Variable that is a gradient wrt r.

This function traverses the graph backward from the r sources, calling op.grad(...) for all ops with some non-None gradient on an output.

The op.grad(...) functions are called like this:

op.grad(op.inputs[:], [total_gradient(v) for v in op.outputs])

This call to op.grad should return a list or tuple: one symbolic gradient per input. If op has a single input, then op.grad should return a list or tuple of length 1.

For each input with respect to which op is not differentiable, op.grad should return None instead of a Variable instance.
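The return convention described above can be illustrated with a minimal sketch. The classes Mul, Neg and Take below are hypothetical toy stand-ins, not Theano ops, and they work on plain Python numbers rather than symbolic Variables; only the shape of the return value (one entry per input, a length-1 list for a single input, None for a non-differentiable input) is meant to mirror the text.

```python
# Toy stand-ins for illustration only; these are NOT Theano classes.
# Gradients are plain numbers here, whereas Theano uses symbolic Variables.


class Mul:
    """z = x * y, differentiable in both inputs."""

    def grad(self, inputs, output_grads):
        x, y = inputs
        (gz,) = output_grads
        # One gradient per input, in the same order as `inputs`.
        return [gz * y, gz * x]


class Neg:
    """z = -x, an op with a single input."""

    def grad(self, inputs, output_grads):
        (gz,) = output_grads
        # Still a list, of length 1.
        return [-gz]


class Take:
    """z = values[index]; not differentiable in the integer index."""

    def grad(self, inputs, output_grads):
        values, index = inputs
        (gz,) = output_grads
        g_values = [0.0] * len(values)
        g_values[index] = gz
        # None marks the input the op is not differentiable with respect to.
        return [g_values, None]


mul_grads = Mul().grad([3.0, 4.0], [1.0])
neg_grads = Neg().grad([3.0], [2.0])
take_grads = Take().grad([[10.0, 20.0, 30.0], 1], [5.0])
```

In each case the returned list has the same length as inputs, which is what lets the caller pair every gradient with the input it belongs to.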

If a source r receives a gradient from another source r2, then the effective gradient on r is the sum of both gradients.

Parameters:
  • sources (list of (Variable, gradient-on-Variable) pairs) – gradients to back-propagate using the chain rule; these pairs initialize the total_gradient dictionary
  • graph_inputs (list of Variable) – variables considered to be constant (gradients are not back-propagated through them)
  • warn_type (bool) – if True, a warning is issued through the logging module when the gradient on an expression has a different type than the original expression
Return type:

dictionary whose keys and values are of type Variable

Returns:

mapping from each Variable encountered in the backward traversal to its [total] gradient.
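The backward traversal described on this page can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Theano's implementation: Var, Apply, Scale and toy_grad_sources_inputs are hypothetical names, gradients are plain floats instead of symbolic Variables, every op has a single output, and warn_type is left out.

```python
# Toy sketch of the backward traversal; NOT Theano's actual code.


class Var:
    """Toy variable; `owner` is the Apply node that produced it, or None."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner


class Apply:
    """Toy node tying an op to its inputs and its outputs."""

    def __init__(self, op, inputs, output_names):
        self.op = op
        self.inputs = list(inputs)
        self.outputs = [Var(n, owner=self) for n in output_names]


class Scale:
    """z = k * x, so the gradient on x is k times the gradient on z."""

    def __init__(self, k):
        self.k = k

    def grad(self, inputs, output_grads):
        (gz,) = output_grads
        return [self.k * gz]


def toy_grad_sources_inputs(sources, graph_inputs):
    total = {}
    for r, g_r in sources:
        # A source listed more than once has its gradients summed.
        total[r] = total.get(r, 0.0) + g_r

    stop = set(graph_inputs)
    order, seen = [], set()

    def visit(var):
        node = var.owner
        # Do not walk past root variables or variables treated as constant.
        if node is None or var in stop or node in seen:
            return
        seen.add(node)
        for inp in node.inputs:
            visit(inp)
        order.append(node)  # inputs are appended before their consumers

    for r, _ in sources:
        visit(r)

    # Walk from the sources back toward the inputs.
    for node in reversed(order):
        out_grads = [total.get(v) for v in node.outputs]
        if all(g is None for g in out_grads):
            continue  # no gradient reaches this node
        in_grads = node.op.grad(node.inputs, out_grads)
        for inp, g in zip(node.inputs, in_grads):
            if g is not None:  # None means "not differentiable"
                total[inp] = total.get(inp, 0.0) + g
    return total


x = Var("x")
a = Apply(Scale(2.0), [x], ["a"]).outputs[0]  # a = 2 * x
b = Apply(Scale(3.0), [a], ["b"]).outputs[0]  # b = 3 * a

# `a` is a source and also receives a gradient from the source `b`.
grads = toy_grad_sources_inputs([(b, 1.0), (a, 10.0)], [])

# Treating `a` as a graph input stops the traversal there.
cut = toy_grad_sources_inputs([(b, 1.0)], [a])
```

In the first call the effective gradient on a is the sum of its own source gradient and the one flowing back from b. In the second call a still receives a gradient, but nothing is propagated to x, so x does not appear in the returned mapping.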