Release Notes

Theano 1.0.0 (15th of November, 2017)

This is a final release of Theano, version 1.0.0, with many new features, interface changes, improvements and bug fixes.

We recommend that everybody update to this version.

Highlights (since 0.9.0):
  • Announcing that MILA will stop developing Theano
  • conda packages are now available and updated in our own conda channel mila-udem. To install: conda install -c mila-udem theano pygpu
  • Support NumPy 1.13
  • Support pygpu 0.7
  • Raised the minimum supported Python 3 version from 3.3 to 3.4
  • Added conda recipe
  • Replaced deprecated package nose-parameterized with up-to-date package parameterized for Theano requirements
  • Theano now internally uses sha256 instead of md5 to work on systems that forbid md5 for security reasons
  • Removed the old GPU backend theano.sandbox.cuda; the new backend theano.gpuarray is now the official GPU backend (see the sketch after this list)
  • Make sure MKL uses GNU OpenMP
    • NB: Matrix dot product (gemm) with mkl from conda could return wrong results in some cases. We have reported the problem upstream and have a workaround that raises an error with information about how to fix it.
  • Improved elemwise operations
    • Sped up elemwise ops based on SciPy
    • Fixed memory leaks related to elemwise ops on GPU
  • Scan improvements
    • Sped up Theano scan compilation and gradient computation
    • Added meaningful message when missing inputs to scan
  • Sped up the graph toposort algorithm
  • Faster C compilation through extensive use of a new interface for op params
  • Faster optimization step, with new optional destroy handler
  • Documentation updated and more complete
    • Added documentation for RNNBlock
    • Updated conv documentation
  • Support more debuggers for PdbBreakpoint
  • Many bug fixes, crash fixes and warning improvements
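
Below is a minimal sketch of using the new gpuarray backend mentioned above; the device name cuda0, the floatX setting and the script itself are illustrative, not prescribed:

    # Run with e.g.: THEANO_FLAGS=device=cuda0,floatX=float32 python example.py
    import numpy as np
    import theano
    import theano.tensor as T

    x = T.matrix('x')
    f = theano.function([x], T.exp(x))   # compiled for theano.gpuarray when device=cuda*
    print(f(np.ones((2, 2), dtype=theano.config.floatX)))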

A total of 71 people contributed to this release since 0.9.0; see the list below.

Interface changes:
  • Merged duplicated diagonal functions into two ops: ExtractDiag (extract a diagonal into a vector) and AllocDiag (write a vector into the diagonal of an otherwise-zero array)
  • Removed op ExtractDiag from theano.tensor.nlinalg, now only in theano.tensor.basic
  • Generalized AllocDiag for any non-scalar input
  • Added new parameter target for MRG functions
  • Renamed MultinomialWOReplacementFromUniform to ChoiceFromUniform
  • Changed the grad() method to L_op() in ops that need the outputs to compute the gradient (see the sketch after this list)
  • Removed or deprecated Theano flags:
    • cublas.lib
    • cuda.enabled
    • enable_initial_driver_test
    • gpuarray.sync
    • home
    • lib.cnmem
    • nvcc.* flags
    • pycuda.init
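
To illustrate the grad()-to-L_op() change above, here is a minimal sketch of a toy op; SquareOp and its implementation are illustrative and not part of Theano:

    import theano.tensor as T
    from theano.gof import Op, Apply

    class SquareOp(Op):
        """Toy op computing y = x ** 2."""
        __props__ = ()

        def make_node(self, x):
            x = T.as_tensor_variable(x)
            return Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            (x,) = inputs
            output_storage[0][0] = x ** 2

        # L_op() replaces grad() for ops that need the outputs: it receives the
        # symbolic outputs in addition to the inputs and the output gradients.
        def L_op(self, inputs, outputs, output_grads):
            (x,) = inputs
            (gz,) = output_grads
            return [2 * x * gz]
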
Convolution updates:
  • Implemented separable convolutions for 2D and 3D
  • Implemented grouped convolutions for 2D and 3D (see the sketch after this list)
  • Added dilated causal convolutions for 2D
  • Added unshared convolutions
  • Implemented fractional bilinear upsampling
  • Removed old conv3d interface
  • Deprecated old conv2d interface
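
A minimal sketch of a grouped 2D convolution, assuming the feature is exposed through a num_groups argument on theano.tensor.nnet.conv2d; shapes and values are illustrative:

    import numpy as np
    import theano
    import theano.tensor as T
    from theano.tensor.nnet import conv2d

    x = T.tensor4('x')  # (batch, channels, rows, cols)
    # 16 output channels; 8 input channels split into 4 groups of 2
    w = theano.shared(np.random.randn(16, 2, 3, 3).astype(theano.config.floatX))

    y = conv2d(x, w, border_mode='half', num_groups=4)
    f = theano.function([x], y)

    out = f(np.random.randn(2, 8, 32, 32).astype(theano.config.floatX))
    print(out.shape)  # (2, 16, 32, 32)
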
GPU:
  • Added a meta-optimizer to select the fastest GPU implementations for convolutions
  • Prevent GPU initialization when not required
  • Added disk caching option for kernels
  • Added method my_theano_function.sync_shared() to help synchronize GPU Theano functions (see the sketch after this list)
  • Added useful stats for GPU in profile mode
  • Added Cholesky op based on cusolver backend
  • Added GPU ops based on the magma library: SVD, matrix inverse, QR, Cholesky and eigh
  • Added GpuCublasTriangularSolve
  • Added atomic addition and exchange for long long values in GpuAdvancedIncSubtensor1_dev20
  • Support log gamma function for all non-complex types
  • Support GPU SoftMax in both OpenCL and CUDA
  • Support offset parameter k for GpuEye
  • CrossentropyCategorical1Hot and its gradient are now lifted to GPU
  • cuDNN:
    • Official support for v6.* and v7.*
    • Added spatial transformation operation based on cuDNN
    • Updated and improved caching system for runtime-chosen cuDNN convolution algorithms
    • Support cuDNN v7 tensor core operations for convolutions with runtime timed algorithms
    • Better support and loading on Windows and Mac
    • Support cuDNN v6 dilated convolutions
    • Support cuDNN v6 reductions for contiguous inputs
    • Optimized SUM(x^2), SUM(ABS(x)) and MAX(ABS(x)) operations with cuDNN reductions
    • Added new Theano flags cuda.include_path, dnn.base_path and dnn.bin_path to help configure Theano when CUDA and cuDNN cannot be found automatically
    • Extended Theano flag dnn.enabled with new option no_check to help speed up cuDNN importation
    • Disallowed float16 precision for convolution gradients
    • Fixed memory alignment detection
    • Added profiling in C debug mode (with theano flag cmodule.debug=True)
    • Added Python scripts to help test cuDNN convolutions
    • Automatic addition of cuDNN DLL path to PATH environment variable on Windows
  • Updated float16 support
    • Added documentation for GPU float16 ops
    • Support float16 for GpuGemmBatch
    • Started to use float32 precision for computations that don’t support float16 on GPU
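
Since the gpuarray.sync flag was removed, the sync_shared() method mentioned above can be used when timing GPU functions; a minimal sketch (the timing code is illustrative):

    import time
    import numpy as np
    import theano
    import theano.tensor as T

    x = theano.shared(np.random.randn(1024, 1024).astype('float32'))
    f = theano.function([], T.dot(x, x))

    t0 = time.time()
    f()
    f.sync_shared()   # wait for the GPU to finish before reading the clock
    print('elapsed: %.3fs' % (time.time() - t0))
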
New features:
  • Implemented truncated normal distribution with the Box-Muller transform
  • Added L_op() overriding option for OpFromGraph
  • Added NumPy C-API based fallback implementation for [sd]gemv_ and [sd]dot_
  • Implemented topk and argtopk on CPU and GPU
  • Implemented max() and min() functions for boolean and unsigned integer types
  • Added tensor6() and tensor7() in theano.tensor module
  • Added boolean indexing for sub-tensors (see the sketch after this list)
  • Added covariance matrix function theano.tensor.cov
  • Added a wrapper for Baidu’s CTC cost and gradient functions
  • Added scalar and elemwise CPU ops for modified Bessel function of order 0 and 1 from scipy.special
  • Added Scaled Exponential Linear Unit (SELU) activation
  • Added sigmoid_binary_crossentropy function
  • Added tri-gamma function
  • Added unravel_index and ravel_multi_index functions on CPU
  • Added modes half and full for Images2Neibs ops
  • Implemented gradient for AbstractBatchNormTrainGrad
  • Implemented gradient for matrix pseudoinverse op
  • Added new prop replace for ChoiceFromUniform op
  • Added new prop on_error for CPU Cholesky op
  • Added new Theano flag deterministic to help control how Theano optimizes certain ops that have deterministic versions. Currently used for subtensor Ops only.
  • Added new Theano flag cycle_detection to speed up the optimization step by reducing time spent on inplace optimizations
  • Added new Theano flag check_stack_trace to help check the stack trace during optimization process
  • Added new Theano flag cmodule.debug to allow a debug mode for Theano C code. Currently used for cuDNN convolutions only.
  • Added new Theano flag pickle_test_value to help disable pickling test values
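
A minimal sketch of the new boolean indexing for sub-tensors; semantics follow NumPy and the values shown are illustrative:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.vector('x')
    positives = x[x > 0]          # boolean mask indexing, as in NumPy
    f = theano.function([x], positives)

    print(f(np.array([-1., 2., -3., 4.], dtype=theano.config.floatX)))
    # prints the positive entries: [ 2.  4.]
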
Others:
  • Kept stack trace for optimizations in new GPU backend
  • Added deprecation warning for the softmax and logsoftmax vector case (see the sketch after this list)
  • Added a warning to announce that a C++ compiler will become mandatory in the next Theano release, 0.11
  • Added R_op() for ZeroGrad
  • Added description for rnnblock
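
As noted above for the deprecated softmax/logsoftmax vector case, a 1-D input can be lifted to a one-row matrix explicitly; a minimal sketch:

    import theano
    import theano.tensor as T

    x = T.vector('x')
    p = T.nnet.softmax(x.dimshuffle('x', 0))[0]   # apply row-wise, then drop the row axis
    f = theano.function([x], p)
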
Other more detailed changes:
  • Fixed invalid casts and index overflows in theano.tensor.signal.pool
  • Fixed gradient error for elemwise minimum and maximum when the compared values are equal
  • Fixed gradient for ARange
  • Removed ViewOp subclass during optimization
  • Removed useless warning when profile is manually disabled
  • Added tests for abstract conv
  • Added a disconnected_outputs option to Rop (see the sketch after this list)
  • Removed theano/compat/six.py
  • Removed COp.get_op_params()
  • Support for lists of strings in Op.c_support_code(), to help avoid duplicating support code
  • Macro names provided for array properties are now standardized in both CPU and GPU C codes
  • Moved all C code files into separate folder c_code in every Theano module
  • Many improvements for Travis CI tests (with better splitting for faster testing)
  • Many improvements for Jenkins CI tests: daily testing on Mac and Windows in addition to Linux
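
A minimal sketch of Rop with the new disconnected_outputs option mentioned above; the accepted values are assumed to mirror grad()'s disconnected_inputs ('ignore', 'warn', 'raise'):

    import numpy as np
    import theano
    import theano.tensor as T
    from theano.gradient import Rop

    x = T.vector('x')
    v = T.vector('v')
    y = T.sum(x ** 2)

    jv = Rop(y, x, v, disconnected_outputs='raise')   # Jacobian-vector product
    f = theano.function([x, v], jv)
    ones = np.ones(3, dtype=theano.config.floatX)
    print(f(ones, ones))   # -> 6.0
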
Committers since 0.9.0:
  • Frederic Bastien
  • Steven Bocco
  • João Victor Tozatti Risso
  • Arnaud Bergeron
  • Mohammed Affan
  • amrithasuresh
  • Pascal Lamblin
  • Reyhane Askari
  • Alexander Matyasko
  • Shawn Tan
  • Simon Lefrancois
  • Adam Becker
  • Vikram
  • Gijs van Tulder
  • Faruk Ahmed
  • Thomas George
  • erakra
  • Andrei Costinescu
  • Boris Fomitchev
  • Zhouhan LIN
  • Aleksandar Botev
  • jhelie
  • xiaoqie
  • Tegan Maharaj
  • Matt Graham
  • Cesar Laurent
  • Gabe Schwartz
  • Juan Camilo Gamboa Higuera
  • Tim Cooijmans
  • Anirudh Goyal
  • Saizheng Zhang
  • Yikang Shen
  • vipulraheja
  • Florian Bordes
  • Sina Honari
  • Chiheb Trabelsi
  • Shubh Vachher
  • Daren Eiri
  • Joseph Paul Cohen
  • Laurent Dinh
  • Mohamed Ishmael Diwan Belghazi
  • Jeff Donahue
  • Ramana Subramanyam
  • Bogdan Budescu
  • Dzmitry Bahdanau
  • Ghislain Antony Vaillant
  • Jan Schlüter
  • Nan Jiang
  • Xavier Bouthillier
  • fo40225
  • mrTsjolder
  • wyjw
  • Aarni Koskela
  • Adam Geitgey
  • Adrian Keet
  • Adrian Seyboldt
  • Anmol Sahoo
  • Chong Wu
  • Holger Kohr
  • Jayanth Koushik
  • Lilian Besson
  • Lv Tao
  • Michael Manukyan
  • Murugesh Marvel
  • NALEPA
  • Rebecca N. Palmer
  • Zotov Yuriy
  • dareneiri
  • lrast
  • morrme
  • naitonium